From patchwork Tue Apr 5 01:46:24 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800957
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz, Andrew Morton, David Hildenbrand, Matthew Wilcox, peterx@redhat.com, Alistair Popple, Nadav Amit, Axel Rasmussen, Andrea Arcangeli, "Kirill A . Shutemov", Hugh Dickins, Jerome Glisse, Mike Rapoport
Subject: [PATCH v8 01/23] mm: Introduce PTE_MARKER swap entry
Date: Mon, 4 Apr 2022 21:46:24 -0400
Message-Id: <20220405014646.13522-2-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

This patch introduces a new swap entry type called PTE_MARKER.  It can be
installed for any pte that maps file-backed memory when the pte is
temporarily zapped, so as to maintain per-pte information.

The information kept in the pte is called a "marker".  Here we define the
marker as "unsigned long" just to match pgoff_t; however, it will only work
if it still fits in swp_offset(), which is e.g. currently 58 bits on
x86_64.
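To make the encoding constraint concrete, here is a minimal userspace sketch of packing a marker into a swap-entry-style value with a type field above an offset field. This is illustration only, not kernel code: the `sketch_` names and the hard-coded 5-bit type / 58-bit offset split are invented for this example (the 58 bits mirror the x86_64 figure quoted above; the real kernel layout is arch-specific).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace sketch of a swp_entry_t-like encoding: a small "type" field
 * stored above a wide "offset" field.  A pte marker rides in the offset
 * field of an entry whose type is the (hypothetical) marker type.
 */
#define SKETCH_OFFSET_BITS 58ULL
#define SKETCH_OFFSET_MASK ((1ULL << SKETCH_OFFSET_BITS) - 1)

typedef struct { uint64_t val; } sketch_swp_entry_t;

static sketch_swp_entry_t sketch_swp_entry(uint64_t type, uint64_t offset)
{
	sketch_swp_entry_t e;

	/* The marker (passed as "offset") must fit below the type field. */
	e.val = (type << SKETCH_OFFSET_BITS) | (offset & SKETCH_OFFSET_MASK);
	return e;
}

static uint64_t sketch_swp_type(sketch_swp_entry_t e)
{
	return e.val >> SKETCH_OFFSET_BITS;
}

static uint64_t sketch_swp_offset(sketch_swp_entry_t e)
{
	return e.val & SKETCH_OFFSET_MASK;
}
```

Round-tripping a marker through the entry and back with `sketch_swp_offset()` is exactly what `pte_marker_get()` does below, modulo the marker mask.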
A new config CONFIG_PTE_MARKER is introduced too; it's by default off.  A
bunch of helpers are defined altogether to service the rest of the pte
marker code.

Signed-off-by: Peter Xu
---
 include/asm-generic/hugetlb.h |  9 ++++
 include/linux/swap.h          | 15 ++++++-
 include/linux/swapops.h       | 78 +++++++++++++++++++++++++++++++++++
 mm/Kconfig                    |  6 +++
 4 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 8e1e6244a89d..f39cad20ffc6 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -2,6 +2,9 @@
 #ifndef _ASM_GENERIC_HUGETLB_H
 #define _ASM_GENERIC_HUGETLB_H
 
+#include <linux/swap.h>
+#include <linux/swapops.h>
+
 static inline pte_t mk_huge_pte(struct page *page, pgprot_t pgprot)
 {
 	return mk_pte(page, pgprot);
@@ -80,6 +83,12 @@ static inline int huge_pte_none(pte_t pte)
 }
 #endif
 
+/* Please refer to comments above pte_none_mostly() for the usage */
+static inline int huge_pte_none_mostly(pte_t pte)
+{
+	return huge_pte_none(pte) || is_pte_marker(pte);
+}
+
 #ifndef __HAVE_ARCH_HUGE_PTE_WRPROTECT
 static inline pte_t huge_pte_wrprotect(pte_t pte)
 {
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 7daae5a4b3e1..5553189d0215 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -55,6 +55,19 @@ static inline int current_is_kswapd(void)
  * actions on faults.
  */
 
+/*
+ * PTE markers are used to persist information onto PTEs that are mapped with
+ * file-backed memories.  As its name "PTE" hints, it should only be applied to
+ * the leaves of pgtables.
+ */
+#ifdef CONFIG_PTE_MARKER
+#define SWP_PTE_MARKER_NUM 1
+#define SWP_PTE_MARKER     (MAX_SWAPFILES + SWP_HWPOISON_NUM + \
+			    SWP_MIGRATION_NUM + SWP_DEVICE_NUM)
+#else
+#define SWP_PTE_MARKER_NUM 0
+#endif
+
 /*
  * Unaddressable device memory support. See include/linux/hmm.h and
  * Documentation/vm/hmm.rst. Short description is we need struct pages for
@@ -107,7 +120,7 @@ static inline int current_is_kswapd(void)
 
 #define MAX_SWAPFILES \
 	((1 << MAX_SWAPFILES_SHIFT) - SWP_DEVICE_NUM - \
-	SWP_MIGRATION_NUM - SWP_HWPOISON_NUM)
+	SWP_MIGRATION_NUM - SWP_HWPOISON_NUM - SWP_PTE_MARKER_NUM)
 
 /*
  * Magic header for a swap area. The first part of the union is
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 32d517a28969..7a00627845f0 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -274,6 +274,84 @@ static inline int is_readable_migration_entry(swp_entry_t entry)
 
 #endif
 
+typedef unsigned long pte_marker;
+
+#define PTE_MARKER_MASK     (0)
+
+#ifdef CONFIG_PTE_MARKER
+
+static inline swp_entry_t make_pte_marker_entry(pte_marker marker)
+{
+	return swp_entry(SWP_PTE_MARKER, marker);
+}
+
+static inline bool is_pte_marker_entry(swp_entry_t entry)
+{
+	return swp_type(entry) == SWP_PTE_MARKER;
+}
+
+static inline pte_marker pte_marker_get(swp_entry_t entry)
+{
+	return swp_offset(entry) & PTE_MARKER_MASK;
+}
+
+static inline bool is_pte_marker(pte_t pte)
+{
+	return is_swap_pte(pte) && is_pte_marker_entry(pte_to_swp_entry(pte));
+}
+
+#else /* CONFIG_PTE_MARKER */
+
+static inline swp_entry_t make_pte_marker_entry(pte_marker marker)
+{
+	/* This should never be called if !CONFIG_PTE_MARKER */
+	WARN_ON_ONCE(1);
+	return swp_entry(0, 0);
+}
+
+static inline bool is_pte_marker_entry(swp_entry_t entry)
+{
+	return false;
+}
+
+static inline pte_marker pte_marker_get(swp_entry_t entry)
+{
+	return 0;
+}
+
+static inline bool is_pte_marker(pte_t pte)
+{
+	return false;
+}
+
+#endif /* CONFIG_PTE_MARKER */
+
+static inline pte_t make_pte_marker(pte_marker marker)
+{
+	return swp_entry_to_pte(make_pte_marker_entry(marker));
+}
+
+/*
+ * This is a special version to check pte_none() just to cover the case when
+ * the pte is a pte marker.  It existed because in many cases the pte marker
+ * should be seen as a none pte; it's just that we have stored some information
+ * onto the none pte so it becomes not-none any more.
+ *
+ * It should be used when the pte is file-backed, ram-based and backing
+ * userspace pages, like shmem.  It is not needed upon pgtables that do not
+ * support pte markers at all.  For example, it's not needed on anonymous
+ * memory, kernel-only memory (including when the system is during-boot),
+ * non-ram based generic file-system.  It's fine to be used even there, but the
+ * extra pte marker check will be pure overhead.
+ *
+ * For systems configured with !CONFIG_PTE_MARKER this will be automatically
+ * optimized to pte_none().
+ */
+static inline int pte_none_mostly(pte_t pte)
+{
+	return pte_none(pte) || is_pte_marker(pte);
+}
+
 static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry)
 {
 	struct page *p = pfn_to_page(swp_offset(entry));
diff --git a/mm/Kconfig b/mm/Kconfig
index 034d87953600..a1688b9314b2 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -909,6 +909,12 @@ config ANON_VMA_NAME
 	  area from being merged with adjacent virtual memory areas due to the
 	  difference in their name.
 
+config PTE_MARKER
+	bool "Marker PTEs support"
+
+	help
+	  Allows to create marker PTEs for file-backed memory.
+
 source "mm/damon/Kconfig"
 
 endmenu

From patchwork Tue Apr 5 01:48:33 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800958
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz, Nadav Amit, Matthew Wilcox, Mike Rapoport, David Hildenbrand, Hugh Dickins, Jerome Glisse, "Kirill A . Shutemov", Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Alistair Popple, peterx@redhat.com
Subject: [PATCH v8 02/23] mm: Teach core mm about pte markers
Date: Mon, 4 Apr 2022 21:48:33 -0400
Message-Id: <20220405014833.14015-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

This patch still does not use pte markers in any way; however, it teaches
the core mm about the pte marker idea.  For example, handle_pte_marker() is
introduced to parse and handle all the pte marker faults.

Many of the changes are about commenting things up - so that we know
there's the possibility of a pte marker showing up, and why we don't need
special code for those cases.
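The "none, mostly" idea this patch spreads through the core mm can be shown with a tiny userspace model. This is a hedged illustration, not the kernel's encoding: the `toy_` names and the use of bit 63 as a marker flag are invented here purely so the predicate is testable; the kernel actually represents markers as swap entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy pte: 0 means "none"; bit 63 (invented) means "holds a pte marker". */
typedef uint64_t toy_pte_t;

#define TOY_PTE_MARKER_FLAG (1ULL << 63)

static bool toy_pte_none(toy_pte_t pte)
{
	return pte == 0;
}

static bool toy_is_pte_marker(toy_pte_t pte)
{
	return (pte & TOY_PTE_MARKER_FLAG) != 0;
}

/*
 * "None, mostly": either truly empty, or empty except for some per-pte
 * metadata stashed in a marker.  This mirrors pte_none_mostly(): callers
 * that only care whether a real page is mapped treat both the same way.
 */
static bool toy_pte_none_mostly(toy_pte_t pte)
{
	return toy_pte_none(pte) || toy_is_pte_marker(pte);
}
```

This is why call sites like `userfaultfd_must_wait()`, `hmm_vma_handle_pte()` and `mincore_pte_range()` below switch from `pte_none()` to `pte_none_mostly()`: a marker pte is not bit-wise none, but for those callers it must behave as if it were.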
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c | 10 ++++++----
 mm/filemap.c     |  5 +++++
 mm/hmm.c         |  2 +-
 mm/memcontrol.c  |  8 ++++++--
 mm/memory.c      | 23 +++++++++++++++++++++
 mm/mincore.c     |  3 ++-
 mm/mprotect.c    |  3 +++
 7 files changed, 46 insertions(+), 8 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index aa0c47cb0d16..8b4a94f5a238 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -249,9 +249,10 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx,
 	/*
 	 * Lockless access: we're in a wait_event so it's ok if it
-	 * changes under us.
+	 * changes under us.  PTE markers should be handled the same as none
+	 * ptes here.
 	 */
-	if (huge_pte_none(pte))
+	if (huge_pte_none_mostly(pte))
 		ret = true;
 	if (!huge_pte_write(pte) && (reason & VM_UFFD_WP))
 		ret = true;
@@ -330,9 +331,10 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	pte = pte_offset_map(pmd, address);
 	/*
 	 * Lockless access: we're in a wait_event so it's ok if it
-	 * changes under us.
+	 * changes under us.  PTE markers should be handled the same as none
+	 * ptes here.
 	 */
-	if (pte_none(*pte))
+	if (pte_none_mostly(*pte))
 		ret = true;
 	if (!pte_write(*pte) && (reason & VM_UFFD_WP))
 		ret = true;
diff --git a/mm/filemap.c b/mm/filemap.c
index 3a5ffb5587cd..ef77dae8c28d 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3382,6 +3382,11 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 		vmf->pte += xas.xa_index - last_pgoff;
 		last_pgoff = xas.xa_index;
 
+		/*
+		 * NOTE: If there're PTE markers, we'll leave them to be
+		 * handled in the specific fault path, and it'll prohibit the
+		 * fault-around logic.
+		 */
 		if (!pte_none(*vmf->pte))
 			goto unlock;
diff --git a/mm/hmm.c b/mm/hmm.c
index af71aac3140e..3fd3242c5e50 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -239,7 +239,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	pte_t pte = *ptep;
 	uint64_t pfn_req_flags = *hmm_pfn;
 
-	if (pte_none(pte)) {
+	if (pte_none_mostly(pte)) {
 		required_fault = hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0);
 		if (required_fault)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7a08737bac4b..08af97c73f0f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5644,10 +5644,14 @@ static enum mc_target_type get_mctgt_type(struct vm_area_struct *vma,
 	if (pte_present(ptent))
 		page = mc_handle_present_pte(vma, addr, ptent);
+	else if (pte_none_mostly(ptent))
+		/*
+		 * PTE markers should be treated as a none pte here, separated
+		 * from other swap handling below.
+		 */
+		page = mc_handle_file_pte(vma, addr, ptent);
 	else if (is_swap_pte(ptent))
 		page = mc_handle_swap_pte(vma, ptent, &ent);
-	else if (pte_none(ptent))
-		page = mc_handle_file_pte(vma, addr, ptent);
 
 	if (!page && !ent.val)
 		return ret;
diff --git a/mm/memory.c b/mm/memory.c
index 2c5d1bb4694f..3f396241a7db 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -100,6 +100,8 @@ struct page *mem_map;
 EXPORT_SYMBOL(mem_map);
 #endif
 
+static vm_fault_t do_fault(struct vm_fault *vmf);
+
 /*
  * A number of key systems in x86 including ioremap() rely on the assumption
  * that high_memory defines the upper bound on direct map memory, then end
@@ -1415,6 +1417,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			if (!should_zap_page(details, page))
 				continue;
 			rss[mm_counter(page)]--;
+		} else if (is_pte_marker_entry(entry)) {
+			/* By default, simply drop all pte markers when zap */
 		} else if (is_hwpoison_entry(entry)) {
 			if (!should_zap_cows(details))
 				continue;
@@ -3555,6 +3559,23 @@ static inline bool should_try_to_free_swap(struct page *page,
 		page_count(page) == 2;
 }
 
+static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
+{
+	swp_entry_t entry = pte_to_swp_entry(vmf->orig_pte);
+	unsigned long marker = pte_marker_get(entry);
+
+	/*
+	 * PTE markers should always be with file-backed memories, and the
+	 * marker should never be empty.  If anything weird happened, the best
+	 * thing to do is to kill the process along with its mm.
+	 */
+	if (WARN_ON_ONCE(vma_is_anonymous(vmf->vma) || !marker))
+		return VM_FAULT_SIGBUS;
+
+	/* TODO: handle pte markers */
+	return 0;
+}
+
 /*
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
@@ -3592,6 +3613,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 			ret = vmf->page->pgmap->ops->migrate_to_ram(vmf);
 		} else if (is_hwpoison_entry(entry)) {
 			ret = VM_FAULT_HWPOISON;
+		} else if (is_pte_marker_entry(entry)) {
+			ret = handle_pte_marker(vmf);
 		} else {
 			print_bad_pte(vma, vmf->address, vmf->orig_pte, NULL);
 			ret = VM_FAULT_SIGBUS;
diff --git a/mm/mincore.c b/mm/mincore.c
index f4f627325e12..fa200c14185f 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -122,7 +122,8 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	for (; addr != end; ptep++, addr += PAGE_SIZE) {
 		pte_t pte = *ptep;
 
-		if (pte_none(pte))
+		/* We need to do cache lookup too for pte markers */
+		if (pte_none_mostly(pte))
 			__mincore_unmapped_range(addr, addr + PAGE_SIZE,
 						 vma, vec);
 		else if (pte_present(pte))
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 56060acdabd3..709a6f73b764 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -188,6 +188,9 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				newpte = pte_swp_mksoft_dirty(newpte);
 			if (pte_swp_uffd_wp(oldpte))
 				newpte = pte_swp_mkuffd_wp(newpte);
+		} else if (is_pte_marker_entry(entry)) {
+			/* Skip it, the same as none pte */
+			continue;
 		} else {
 			newpte = oldpte;
 		}

From patchwork Tue Apr 5 01:48:36 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800959
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz, Nadav Amit, Matthew Wilcox, Mike Rapoport, David Hildenbrand, Hugh Dickins, Jerome Glisse, "Kirill A . Shutemov", Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Alistair Popple, peterx@redhat.com
Subject: [PATCH v8 03/23] mm: Check against orig_pte for finish_fault()
Date: Mon, 4 Apr 2022 21:48:36 -0400
Message-Id: <20220405014836.14077-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

We used to check against the none pte in finish_fault(), with the
assumption that the orig_pte is always the none pte.

This change prepares us to be able to call do_fault() on !none ptes.  For
example, we should allow that to happen for pte markers so that we can
restore information out of the pte markers.

Let's change the "pte_none" check into detecting changes since we fetched
orig_pte.
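The shift from "is the slot still empty?" to "is the slot still what we sampled?" can be sketched in a few lines of userspace C. This is a toy model under stated assumptions: a pte is just a `uint64_t` where 0 means none, the `toy_` names are invented, and the real kernel check is `pte_same(*vmf->pte, vmf->orig_pte)` taken under the page-table lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy pte: plain 64-bit value, 0 == none pte. */
typedef uint64_t toy_pte_t;

/* Old logic: only proceed if the slot is still empty under the lock. */
static bool toy_finish_fault_ok_old(toy_pte_t current_pte)
{
	return current_pte == 0;
}

/*
 * New logic: proceed if the slot still holds exactly the value we sampled
 * into orig_pte before dropping the lock.  A non-none orig_pte (e.g. a pte
 * marker) is now acceptable, while any concurrent change still forces a
 * retry.
 */
static bool toy_finish_fault_ok(toy_pte_t current_pte, toy_pte_t orig_pte)
{
	return current_pte == orig_pte;
}
```

The marker case is the motivation: with the old check, a fault taken on a marker pte (non-zero, so "not none") could never complete even when nothing raced with it.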
One trivial thing to take care of here is, when pmd==NULL for the pgtable
we may not initialize orig_pte at all in handle_pte_fault().

By default orig_pte will be all zeros; however, the problem is that not all
architectures use all-zeros for a none pte.  pte_clear() will be the right
thing to use here so that we'll always have a valid orig_pte value for the
whole handle_pte_fault() call.

Signed-off-by: Peter Xu
Reviewed-by: Alistair Popple
Reported-by: Marek Szyprowski
Tested-by: Marek Szyprowski
---
 mm/memory.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 3f396241a7db..b1af996b09ca 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4241,7 +4241,7 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 				      vmf->address, &vmf->ptl);
 	ret = 0;
 	/* Re-check under ptl */
-	if (likely(pte_none(*vmf->pte)))
+	if (likely(pte_same(*vmf->pte, vmf->orig_pte)))
 		do_set_pte(vmf, page, vmf->address);
 	else
 		ret = VM_FAULT_NOPAGE;
@@ -4709,6 +4709,13 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 		 * concurrent faults and from rmap lookups.
 		 */
 		vmf->pte = NULL;
+		/*
+		 * Always initialize orig_pte.  This matches with below
+		 * code to have orig_pte to be the none pte if pte==NULL.
+		 * This makes the rest code to be always safe to reference
+		 * it, e.g. in finish_fault() we'll detect pte changes.
+		 */
+		pte_clear(vmf->vma->vm_mm, vmf->address, &vmf->orig_pte);
 	} else {
 		/*
 		 * If a huge pmd materialized under us just retry later.  Use

From patchwork Tue Apr 5 01:48:38 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800960
[99.241.198.116]) by smtp.gmail.com with ESMTPSA id s5-20020a056602168500b0064c82210ce4sm7650607iow.13.2022.04.04.18.48.40 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Mon, 04 Apr 2022 18:48:41 -0700 (PDT) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Mike Kravetz , Nadav Amit , Matthew Wilcox , Mike Rapoport , David Hildenbrand , Hugh Dickins , Jerome Glisse , "Kirill A . Shutemov" , Andrea Arcangeli , Andrew Morton , Axel Rasmussen , Alistair Popple , peterx@redhat.com Subject: [PATCH v8 04/23] mm/uffd: PTE_MARKER_UFFD_WP Date: Mon, 4 Apr 2022 21:48:38 -0400 Message-Id: <20220405014838.14131-1-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220405014646.13522-1-peterx@redhat.com> References: <20220405014646.13522-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=gNTy7bOb; spf=none (imf11.hostedemail.com: domain of peterx@redhat.com has no SPF policy when checking 170.10.129.124) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspam-User: X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 98BED40009 X-Stat-Signature: rg3gy73ureq5u5zj7siotxga3opnjd46 X-HE-Tag: 1649123323-764163 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This patch introduces the 1st user of pte marker: the uffd-wp marker. When the pte marker is installed with the uffd-wp bit set, it means this pte was wr-protected by uffd. We will use this special pte to arm the ptes that got either unmapped or swapped out for a file-backed region that was previously wr-protected. This special pte could trigger a page fault just like swap entries. 
This idea is greatly inspired by Hugh and Andrea in the discussion, which is
referenced in the links below.

Some helpers are introduced to detect whether a swap pte is uffd
wr-protected.  After the pte marker is introduced, one swap pte can be
wr-protected in two forms: either it is a normal swap pte and it has
_PAGE_SWP_UFFD_WP set, or it's a pte marker that has PTE_MARKER_UFFD_WP set.

Link: https://lore.kernel.org/lkml/20201126222359.8120-1-peterx@redhat.com/
Link: https://lore.kernel.org/lkml/20201130230603.46187-1-peterx@redhat.com/
Suggested-by: Andrea Arcangeli
Suggested-by: Hugh Dickins
Signed-off-by: Peter Xu
Reported-by: kernel test robot
---
 include/linux/swapops.h       |  3 ++-
 include/linux/userfaultfd_k.h | 43 +++++++++++++++++++++++++++++++++++
 mm/Kconfig                    |  9 ++++++++
 3 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 7a00627845f0..fffbba0036f6 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -276,7 +276,8 @@ static inline int is_readable_migration_entry(swp_entry_t entry)
 
 typedef unsigned long pte_marker;
 
-#define  PTE_MARKER_MASK     (0)
+#define  PTE_MARKER_UFFD_WP  BIT(0)
+#define  PTE_MARKER_MASK     (PTE_MARKER_UFFD_WP)
 
 #ifdef CONFIG_PTE_MARKER
 
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 33cea484d1ad..bd09c3c89b59 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -15,6 +15,8 @@
 #include
 #include
+#include
+#include
 #include
 
 /* The set of all possible UFFD-related VM flags.
 */
@@ -236,4 +238,45 @@ static inline void userfaultfd_unmap_complete(struct mm_struct *mm,
 
 #endif /* CONFIG_USERFAULTFD */
 
+static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry)
+{
+	return is_pte_marker_entry(entry) &&
+	    (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
+}
+
+static inline bool pte_marker_uffd_wp(pte_t pte)
+{
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
+	swp_entry_t entry;
+
+	if (!is_swap_pte(pte))
+		return false;
+
+	entry = pte_to_swp_entry(pte);
+
+	return pte_marker_entry_uffd_wp(entry);
+#else
+	return false;
+#endif
+}
+
+/*
+ * Returns true if this is a swap pte and was uffd-wp wr-protected in either
+ * forms (pte marker or a normal swap pte), false otherwise.
+ */
+static inline bool pte_swp_uffd_wp_any(pte_t pte)
+{
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
+	if (!is_swap_pte(pte))
+		return false;
+
+	if (pte_swp_uffd_wp(pte))
+		return true;
+
+	if (pte_marker_uffd_wp(pte))
+		return true;
+#endif
+	return false;
+}
+
 #endif /* _LINUX_USERFAULTFD_K_H */
diff --git a/mm/Kconfig b/mm/Kconfig
index a1688b9314b2..6e7c2d59fa96 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -915,6 +915,15 @@ config PTE_MARKER
 	help
 	  Allows to create marker PTEs for file-backed memory.
 
+config PTE_MARKER_UFFD_WP
+	bool "Marker PTEs support for userfaultfd write protection"
+	depends on PTE_MARKER && HAVE_ARCH_USERFAULTFD_WP
+
+	help
+	  Allows to create marker PTEs for userfaultfd write protection
+	  purposes.  It is required to enable userfaultfd write protection on
+	  file-backed memory types like shmem and hugetlbfs.
+
 source "mm/damon/Kconfig"
 
 endmenu

From patchwork Tue Apr 5 01:48:41 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800961
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v8 05/23] mm/shmem: Take care of UFFDIO_COPY_MODE_WP
Date: Mon, 4 Apr 2022 21:48:41 -0400
Message-Id: <20220405014841.14185-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>

Pass wp_copy into shmem_mfill_atomic_pte() through the stack, then apply the
uffd-wp bit properly when the UFFDIO_COPY on shmem is requested with
UFFDIO_COPY_MODE_WP.  wp_copy finally lands in mfill_atomic_install_pte().

Note: we must do pte_wrprotect() if !writable in mfill_atomic_install_pte(),
as mk_pte() could return a writable pte (e.g., when VM_SHARED on a shmem
file).
Signed-off-by: Peter Xu
---
 include/linux/shmem_fs.h |  4 ++--
 mm/shmem.c               |  4 ++--
 mm/userfaultfd.c         | 23 ++++++++++++++++++-----
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 3e915cc550bc..a68f982f22d1 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -145,11 +145,11 @@ extern int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 				  pmd_t *dst_pmd,
 				  struct vm_area_struct *dst_vma,
 				  unsigned long dst_addr,
 				  unsigned long src_addr,
-				  bool zeropage,
+				  bool zeropage, bool wp_copy,
 				  struct page **pagep);
 #else /* !CONFIG_SHMEM */
 #define shmem_mfill_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, \
-			       src_addr, zeropage, pagep)       ({ BUG(); 0; })
+			       src_addr, zeropage, wp_copy, pagep) ({ BUG(); 0; })
 #endif /* CONFIG_SHMEM */
 #endif /* CONFIG_USERFAULTFD */
 
diff --git a/mm/shmem.c b/mm/shmem.c
index 7004c7f55716..9efb8a96d75e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2319,7 +2319,7 @@ int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 			   struct vm_area_struct *dst_vma,
 			   unsigned long dst_addr,
 			   unsigned long src_addr,
-			   bool zeropage,
+			   bool zeropage, bool wp_copy,
 			   struct page **pagep)
 {
 	struct inode *inode = file_inode(dst_vma->vm_file);
@@ -2392,7 +2392,7 @@ int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 		goto out_release;
 
 	ret = mfill_atomic_install_pte(dst_mm, dst_pmd, dst_vma, dst_addr,
-				       page, true, false);
+				       page, true, wp_copy);
 	if (ret)
 		goto out_delete_from_cache;
 
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index dae25d985d15..b1c875b77fbb 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -77,10 +77,19 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
 	 * Always mark a PTE as write-protected when needed, regardless of
 	 * VM_WRITE, which the user might change.
 	 */
-	if (wp_copy)
+	if (wp_copy) {
 		_dst_pte = pte_mkuffd_wp(_dst_pte);
-	else if (writable)
+		writable = false;
+	}
+
+	if (writable)
 		_dst_pte = pte_mkwrite(_dst_pte);
+	else
+		/*
+		 * We need this to make sure write bit removed; as mk_pte()
+		 * could return a pte with write bit set.
+		 */
+		_dst_pte = pte_wrprotect(_dst_pte);
 
 	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
 
@@ -95,7 +104,12 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
 	}
 
 	ret = -EEXIST;
-	if (!pte_none(*dst_pte))
+	/*
+	 * We allow to overwrite a pte marker: consider when both MISSING|WP
+	 * registered, we firstly wr-protect a none pte which has no page cache
+	 * page backing it, then access the page.
+	 */
+	if (!pte_none_mostly(*dst_pte))
 		goto out_unlock;
 
 	if (page_in_cache) {
@@ -479,11 +493,10 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
 		err = mfill_zeropage_pte(dst_mm, dst_pmd,
 					 dst_vma, dst_addr);
 	} else {
-		VM_WARN_ON_ONCE(wp_copy);
 		err = shmem_mfill_atomic_pte(dst_mm, dst_pmd,
 					     dst_vma, dst_addr,
 					     src_addr,
 					     mode != MCOPY_ATOMIC_NORMAL,
-					     page);
+					     wp_copy, page);
 	}
 
 	return err;

From patchwork Tue Apr 5 01:48:44 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800962
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v8 06/23] mm/shmem: Handle uffd-wp special pte in page fault handler
Date: Mon, 4 Apr 2022 21:48:44 -0400
Message-Id: <20220405014844.14239-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>

File-backed memories are prone to unmap/swap, so their ptes are always
unstable: they can easily be faulted back in later using the page cache.
This could lead to uffd-wp getting lost when unmapping or swapping out such
memory.  One example is shmem.  PTE markers are needed to store that
information.

This patch prepares for that by handling uffd-wp pte markers first, before
they are installed elsewhere, so that the page fault handler can recognize
them.  The handling of a uffd-wp pte marker is similar to a missing fault:
we handle this "missing fault" when we see the pte marker, while making sure
the marker information is kept during processing of the fault.

This is a slow path of uffd-wp handling, because zapping of wr-protected
shmem ptes should be rare.
So far it should only trigger in two conditions:

  (1) When trying to punch holes in shmem_fallocate(), there is an
      optimization to zap the pgtables before evicting the page.

  (2) When swapping out shmem pages.

Because of this, the page fault handling is simplified too: instead of
sending the wr-protect message on the 1st page fault, the page will be
installed read-only, so the uffd-wp message will be generated in the next
fault, which will trigger the do_wp_page() path of general uffd-wp handling.

Disable fault-around for all uffd-wp registered ranges for extra safety,
just like uffd-minor faults, and clean the code up.

Signed-off-by: Peter Xu
---
 include/linux/userfaultfd_k.h | 17 +++++++++
 mm/memory.c                   | 67 ++++++++++++++++++++++++++++++-----
 2 files changed, 75 insertions(+), 9 deletions(-)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index bd09c3c89b59..827e38b7be65 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -96,6 +96,18 @@ static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
 	return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
 }
 
+/*
+ * Don't do fault around for either WP or MINOR registered uffd range.  For
+ * MINOR registered range, fault around will be a total disaster and ptes can
+ * be installed without notifications; for WP it should mostly be fine as long
+ * as the fault around checks for pte_none() before the installation, however
+ * to be super safe we just forbid it.
+ */
+static inline bool uffd_disable_fault_around(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
+}
+
 static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 {
 	return vma->vm_flags & VM_UFFD_MISSING;
@@ -236,6 +248,11 @@ static inline void userfaultfd_unmap_complete(struct mm_struct *mm,
 {
 }
 
+static inline bool uffd_disable_fault_around(struct vm_area_struct *vma)
+{
+	return false;
+}
+
 #endif /* CONFIG_USERFAULTFD */
 
 static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry)
diff --git a/mm/memory.c b/mm/memory.c
index b1af996b09ca..21abb8a30553 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3559,6 +3559,39 @@ static inline bool should_try_to_free_swap(struct page *page,
 	       page_count(page) == 2;
 }
 
+static vm_fault_t pte_marker_clear(struct vm_fault *vmf)
+{
+	vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
+				       vmf->address, &vmf->ptl);
+	/*
+	 * Be careful so that we will only recover a special uffd-wp pte into a
+	 * none pte.  Otherwise it means the pte could have changed, so retry.
+	 */
+	if (is_pte_marker(*vmf->pte))
+		pte_clear(vmf->vma->vm_mm, vmf->address, vmf->pte);
+	pte_unmap_unlock(vmf->pte, vmf->ptl);
+	return 0;
+}
+
+/*
+ * This is actually a page-missing access, but with uffd-wp special pte
+ * installed.  It means this pte was wr-protected before being unmapped.
+ */
+static vm_fault_t pte_marker_handle_uffd_wp(struct vm_fault *vmf)
+{
+	/*
+	 * Just in case there're leftover special ptes even after the region
+	 * got unregistered - we can simply clear them.  We can also do that
+	 * proactively when e.g. when we do UFFDIO_UNREGISTER upon some uffd-wp
+	 * ranges, but it should be more efficient to be done lazily here.
+	 */
+	if (unlikely(!userfaultfd_wp(vmf->vma) || vma_is_anonymous(vmf->vma)))
+		return pte_marker_clear(vmf);
+
+	/* do_fault() can handle pte markers too like none pte */
+	return do_fault(vmf);
+}
+
 static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
 {
 	swp_entry_t entry = pte_to_swp_entry(vmf->orig_pte);
@@ -3572,8 +3605,11 @@ static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
 	if (WARN_ON_ONCE(vma_is_anonymous(vmf->vma) || !marker))
 		return VM_FAULT_SIGBUS;
 
-	/* TODO: handle pte markers */
-	return 0;
+	if (pte_marker_entry_uffd_wp(entry))
+		return pte_marker_handle_uffd_wp(vmf);
+
+	/* This is an unknown pte marker */
+	return VM_FAULT_SIGBUS;
 }
 
 /*
@@ -4157,6 +4193,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
 void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
 {
 	struct vm_area_struct *vma = vmf->vma;
+	bool uffd_wp = pte_marker_uffd_wp(vmf->orig_pte);
 	bool write = vmf->flags & FAULT_FLAG_WRITE;
 	bool prefault = vmf->address != addr;
 	pte_t entry;
@@ -4171,6 +4208,8 @@ void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
 	if (write)
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
 
+	if (unlikely(uffd_wp))
+		entry = pte_mkuffd_wp(pte_wrprotect(entry));
 	/* copy-on-write page */
 	if (write && !(vma->vm_flags & VM_SHARED)) {
 		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
@@ -4344,9 +4383,21 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
 	return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
 }
 
+/* Return true if we should do read fault-around, false otherwise */
+static inline bool should_fault_around(struct vm_fault *vmf)
+{
+	/* No ->map_pages?  No way to fault around...
+	 */
+	if (!vmf->vma->vm_ops->map_pages)
+		return false;
+
+	if (uffd_disable_fault_around(vmf->vma))
+		return false;
+
+	return fault_around_bytes >> PAGE_SHIFT > 1;
+}
+
 static vm_fault_t do_read_fault(struct vm_fault *vmf)
 {
-	struct vm_area_struct *vma = vmf->vma;
 	vm_fault_t ret = 0;
 
 	/*
@@ -4354,12 +4405,10 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf)
 	 * if page by the offset is not ready to be mapped (cold cache or
 	 * something).
 	 */
-	if (vma->vm_ops->map_pages && fault_around_bytes >> PAGE_SHIFT > 1) {
-		if (likely(!userfaultfd_minor(vmf->vma))) {
-			ret = do_fault_around(vmf);
-			if (ret)
-				return ret;
-		}
+	if (should_fault_around(vmf)) {
+		ret = do_fault_around(vmf);
+		if (ret)
+			return ret;
 	}
 
 	ret = __do_fault(vmf);

From patchwork Tue Apr 5 01:48:47 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800966
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v8 07/23] mm/shmem: Persist uffd-wp bit across zapping for file-backed
Date: Mon, 4 Apr 2022 21:48:47 -0400
Message-Id: <20220405014847.14295-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>

File-backed memory is prone to being unmapped at any time.  It means all
information in the pte will be dropped, including the uffd-wp flag.

To persist the uffd-wp flag, we'll use the pte markers.  This patch teaches
the zap code to understand uffd-wp and know when to keep or drop the uffd-wp
bit.

Add a new flag ZAP_FLAG_DROP_MARKER and set it in zap_details when we don't
want to persist such information, for example, when destroying the whole
vma, or punching a hole in a shmem file.  For all other cases we should
never drop the uffd-wp bit, or the wr-protect information will get lost.

The new ZAP_FLAG_DROP_MARKER needs to be put into mm.h rather than memory.c
because it'll be further referenced in hugetlb files later.

Signed-off-by: Peter Xu
---
 include/linux/mm.h        | 10 ++++++++
 include/linux/mm_inline.h | 43 ++++++++++++++++++++++++++++++++++
 mm/memory.c               | 49 ++++++++++++++++++++++++++++++++++++---
 mm/rmap.c                 |  8 +++++++
 4 files changed, 107 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 26428ff262fc..857bc8f7af45 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3422,4 +3422,14 @@ madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
 }
 #endif
 
+typedef unsigned int __bitwise zap_flags_t;
+
+/*
+ * Whether to drop the pte markers, for example, the uffd-wp information for
+ * file-backed memory.  This should only be specified when we will completely
+ * drop the page in the mm, either by truncation or unmapping of the vma.  By
+ * default, the flag is not set.
+ */ +#define ZAP_FLAG_DROP_MARKER ((__force zap_flags_t) BIT(0)) + #endif /* _LINUX_MM_H */ diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index ac32125745ab..7b25b53c474a 100644 --- a/include/linux/mm_inline.h +++ b/include/linux/mm_inline.h @@ -6,6 +6,8 @@ #include #include #include +#include +#include /** * folio_is_file_lru - Should the folio be on a file LRU or anon LRU? @@ -316,5 +318,46 @@ static inline bool mm_tlb_flush_nested(struct mm_struct *mm) return atomic_read(&mm->tlb_flush_pending) > 1; } +/* + * If this pte is wr-protected by uffd-wp in any form, arm the special pte to + * replace a none pte. NOTE! This should only be called when *pte is already + * cleared so we will never accidentally replace something valuable. Meanwhile + * none pte also means we are not demoting the pte so tlb flushed is not needed. + * E.g., when pte cleared the caller should have taken care of the tlb flush. + * + * Must be called with pgtable lock held so that no thread will see the none + * pte, and if they see it, they'll fault and serialize at the pgtable lock. + * + * This function is a no-op if PTE_MARKER_UFFD_WP is not enabled. + */ +static inline void +pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr, + pte_t *pte, pte_t pteval) +{ +#ifdef CONFIG_PTE_MARKER_UFFD_WP + bool arm_uffd_pte = false; + + /* The current status of the pte should be "cleared" before calling */ + WARN_ON_ONCE(!pte_none(*pte)); + + if (vma_is_anonymous(vma) || !userfaultfd_wp(vma)) + return; + + /* A uffd-wp wr-protected normal pte */ + if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval))) + arm_uffd_pte = true; + + /* + * A uffd-wp wr-protected swap pte. Note: this should even cover an + * existing pte marker with uffd-wp bit set. 
+ */ + if (unlikely(pte_swp_uffd_wp_any(pteval))) + arm_uffd_pte = true; + + if (unlikely(arm_uffd_pte)) + set_pte_at(vma->vm_mm, addr, pte, + make_pte_marker(PTE_MARKER_UFFD_WP)); +#endif +} #endif diff --git a/mm/memory.c b/mm/memory.c index 21abb8a30553..1144845ff734 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -74,6 +74,7 @@ #include #include #include +#include #include @@ -1306,6 +1307,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma) struct zap_details { struct folio *single_folio; /* Locked folio to be unmapped */ bool even_cows; /* Zap COWed private pages too? */ + zap_flags_t zap_flags; /* Extra flags for zapping */ }; /* Whether we should zap all COWed (private) pages too */ @@ -1334,6 +1336,29 @@ static inline bool should_zap_page(struct zap_details *details, struct page *pag return !PageAnon(page); } +static inline bool zap_drop_file_uffd_wp(struct zap_details *details) +{ + if (!details) + return false; + + return details->zap_flags & ZAP_FLAG_DROP_MARKER; +} + +/* + * This function makes sure that we'll replace the none pte with an uffd-wp + * swap special pte marker when necessary. Must be with the pgtable lock held. 
+ */ +static inline void +zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, + unsigned long addr, pte_t *pte, + struct zap_details *details, pte_t pteval) +{ + if (zap_drop_file_uffd_wp(details)) + return; + + pte_install_uffd_wp_if_needed(vma, addr, pte, pteval); +} + static unsigned long zap_pte_range(struct mmu_gather *tlb, struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, @@ -1371,6 +1396,8 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); tlb_remove_tlb_entry(tlb, pte, addr); + zap_install_uffd_wp_if_needed(vma, addr, pte, details, + ptent); if (unlikely(!page)) continue; @@ -1401,6 +1428,13 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, page = pfn_swap_entry_to_page(entry); if (unlikely(!should_zap_page(details, page))) continue; + /* + * Both device private/exclusive mappings should only + * work with anonymous page so far, so we don't need to + * consider uffd-wp bit when zap. For more information, + * see zap_install_uffd_wp_if_needed(). 
+ */ + WARN_ON_ONCE(!vma_is_anonymous(vma)); rss[mm_counter(page)]--; if (is_device_private_entry(entry)) page_remove_rmap(page, vma, false); @@ -1417,8 +1451,10 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, if (!should_zap_page(details, page)) continue; rss[mm_counter(page)]--; - } else if (is_pte_marker_entry(entry)) { - /* By default, simply drop all pte markers when zap */ + } else if (pte_marker_entry_uffd_wp(entry)) { + /* Only drop the uffd-wp marker if explicitly requested */ + if (!zap_drop_file_uffd_wp(details)) + continue; } else if (is_hwpoison_entry(entry)) { if (!should_zap_cows(details)) continue; @@ -1427,6 +1463,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, WARN_ON_ONCE(1); } pte_clear_not_present_full(mm, addr, pte, tlb->fullmm); + zap_install_uffd_wp_if_needed(vma, addr, pte, details, ptent); } while (pte++, addr += PAGE_SIZE, addr != end); add_mm_rss_vec(mm, rss); @@ -1637,12 +1674,17 @@ void unmap_vmas(struct mmu_gather *tlb, unsigned long end_addr) { struct mmu_notifier_range range; + struct zap_details details = { + .zap_flags = ZAP_FLAG_DROP_MARKER, + /* Careful - we need to zap private pages too! 
*/ + .even_cows = true, + }; mmu_notifier_range_init(&range, MMU_NOTIFY_UNMAP, 0, vma, vma->vm_mm, start_addr, end_addr); mmu_notifier_invalidate_range_start(&range); for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next) - unmap_single_vma(tlb, vma, start_addr, end_addr, NULL); + unmap_single_vma(tlb, vma, start_addr, end_addr, &details); mmu_notifier_invalidate_range_end(&range); } @@ -3438,6 +3480,7 @@ void unmap_mapping_folio(struct folio *folio) details.even_cows = false; details.single_folio = folio; + details.zap_flags = ZAP_FLAG_DROP_MARKER; i_mmap_lock_read(mapping); if (unlikely(!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))) diff --git a/mm/rmap.c b/mm/rmap.c index 208b2c683cec..69416072b1a6 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -73,6 +73,7 @@ #include #include #include +#include #include @@ -1538,6 +1539,13 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma, pteval = ptep_clear_flush(vma, address, pvmw.pte); } + /* + * Now the pte is cleared. If this pte was uffd-wp armed, + * we may want to replace a none pte with a marker pte if + * it's file-backed, so we don't lose the tracking info. + */ + pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval); + /* Set the dirty flag on the folio now the pte is gone. 
		 */
		if (pte_dirty(pteval))
			folio_mark_dirty(folio);

From patchwork Tue Apr 5 01:48:50 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800963
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz, Nadav Amit, Matthew Wilcox, Mike Rapoport, David Hildenbrand,
    Hugh Dickins, Jerome Glisse, Kirill A. Shutemov, Andrea Arcangeli,
    Andrew Morton, Axel Rasmussen, Alistair Popple, peterx@redhat.com
Subject: [PATCH v8 08/23] mm/shmem: Allow uffd wr-protect none pte for file-backed mem
Date: Mon, 4 Apr 2022 21:48:50 -0400
Message-Id: <20220405014850.14352-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

File-backed memory differs from anonymous memory in that even if the pte is
missing, the data could still reside either in the file or in the page/swap
cache.  So when wr-protecting a pte, we need to consider none ptes too.  We do
that by installing the uffd-wp pte markers when necessary.
So when there's a future write to the pte, the fault handler will take the
special path to first fault in the page as read-only, then report to the
userfaultfd server with the wr-protect message.

On the other hand, when unprotecting a page, it's also possible that the pte
got unmapped but replaced by the special uffd-wp marker.  Then we'll need to
be able to recover from a uffd-wp pte marker into a none pte, so that the next
access to the page will fault in correctly as usual.

Special care needs to be taken throughout the change_protection_range()
process.  Since we now allow userspace to wr-protect a none pte, we need to be
able to pre-populate the page table entries if we see (!anonymous &&
MM_CP_UFFD_WP) requests; otherwise change_protection_range() will always skip
when the pgtable entry does not exist.

For example, the pgtable can be missing for a whole chunk of a 2M pmd, but the
page cache can exist for the 2M range.  When we want to wr-protect one 4K page
within the 2M pmd range, we need to pre-populate the pgtable and install the
pte marker showing that we want to get a message and block the thread when the
page cache of that 4K page is written.  Without pre-populating the pmd,
change_protection() will simply skip that whole pmd.

Note that this patch only covers the small pages (pte level), not the
transparent huge pages yet.  That will be done later, and this patch is a
preparation for it too.
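[Editor's note] The per-pte behavior described above can be modeled by a small
userspace sketch.  This is not kernel code: enum pte_state and change_pte()
are illustrative stand-ins for the logic that, in the kernel, lives in
change_pte_range() and operates on real ptes under the pgtable lock; present
ptes go through the normal pte_modify() path and are not modeled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of one pte slot: empty, mapping a page, or holding a uffd-wp marker. */
enum pte_state { PTE_NONE, PTE_PRESENT, PTE_UFFD_WP_MARKER };

/*
 * Sketch of the per-pte decision for a file-backed (non-anonymous) vma:
 *  - unprotect (uffd_wp_resolve) on a marker: clear it back to a none pte,
 *    so the next access faults in normally without uffd trapping;
 *  - wr-protect (uffd_wp) on a none pte: install a marker, so the
 *    wr-protect information survives even though no page is mapped.
 */
static enum pte_state
change_pte(enum pte_state pte, bool anon_vma, bool uffd_wp, bool uffd_wp_resolve)
{
	if (pte == PTE_UFFD_WP_MARKER && uffd_wp_resolve)
		return PTE_NONE;		/* drop the marker */
	if (pte == PTE_NONE && uffd_wp && !anon_vma)
		return PTE_UFFD_WP_MARKER;	/* remember wp without a page */
	return pte;				/* otherwise leave it alone */
}
```

For instance, wr-protecting an unmapped 4K page of a shmem file corresponds to
`change_pte(PTE_NONE, false, true, false)`, which yields `PTE_UFFD_WP_MARKER`.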
Signed-off-by: Peter Xu
---
 mm/mprotect.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 62 insertions(+), 2 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 709a6f73b764..bd62d5938c6c 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -188,8 +189,16 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				newpte = pte_swp_mksoft_dirty(newpte);
 			if (pte_swp_uffd_wp(oldpte))
 				newpte = pte_swp_mkuffd_wp(newpte);
-		} else if (is_pte_marker_entry(entry)) {
-			/* Skip it, the same as none pte */
+		} else if (pte_marker_entry_uffd_wp(entry)) {
+			/*
+			 * If this is uffd-wp pte marker and we'd like
+			 * to unprotect it, drop it; the next page
+			 * fault will trigger without uffd trapping.
+			 */
+			if (uffd_wp_resolve) {
+				pte_clear(vma->vm_mm, addr, pte);
+				pages++;
+			}
 			continue;
 		} else {
 			newpte = oldpte;
@@ -204,6 +213,20 @@ static unsigned long change_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 				set_pte_at(vma->vm_mm, addr, pte, newpte);
 				pages++;
 			}
+		} else {
+			/* It must be an none page, or what else?.. */
+			WARN_ON_ONCE(!pte_none(oldpte));
+			if (unlikely(uffd_wp && !vma_is_anonymous(vma))) {
+				/*
+				 * For file-backed mem, we need to be able to
+				 * wr-protect a none pte, because even if the
+				 * pte is none, the page/swap cache could
+				 * exist. Doing that by install a marker.
+				 */
+				set_pte_at(vma->vm_mm, addr, pte,
+					   make_pte_marker(PTE_MARKER_UFFD_WP));
+				pages++;
+			}
 		}
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 	arch_leave_lazy_mmu_mode();
@@ -237,6 +260,39 @@ static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd)
 	return 0;
 }
 
+/* Return true if we're uffd wr-protecting file-backed memory, or false */
+static inline bool
+uffd_wp_protect_file(struct vm_area_struct *vma, unsigned long cp_flags)
+{
+	return (cp_flags & MM_CP_UFFD_WP) && !vma_is_anonymous(vma);
+}
+
+/*
+ * If wr-protecting the range for file-backed, populate pgtable for the case
+ * when pgtable is empty but page cache exists.  When {pte|pmd|...}_alloc()
+ * failed it means no memory, we don't have a better option but stop.
+ */
+#define change_pmd_prepare(vma, pmd, cp_flags)				\
+	do {								\
+		if (unlikely(uffd_wp_protect_file(vma, cp_flags))) {	\
+			if (WARN_ON_ONCE(pte_alloc(vma->vm_mm, pmd)))	\
+				break;					\
+		}							\
+	} while (0)
+/*
+ * This is the general pud/p4d/pgd version of change_pmd_prepare(). We need to
+ * have separate change_pmd_prepare() because pte_alloc() returns 0 on success,
+ * while {pmd|pud|p4d}_alloc() returns the valid pointer on success.
+ */
+#define change_prepare(vma, high, low, addr, cp_flags)			\
+	do {								\
+		if (unlikely(uffd_wp_protect_file(vma, cp_flags))) {	\
+			low##_t *p = low##_alloc(vma->vm_mm, high, addr); \
+			if (WARN_ON_ONCE(p == NULL))			\
+				break;					\
+		}							\
+	} while (0)
+
 static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		pud_t *pud, unsigned long addr, unsigned long end,
 		pgprot_t newprot, unsigned long cp_flags)
@@ -255,6 +311,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 
 		next = pmd_addr_end(addr, end);
 
+		change_pmd_prepare(vma, pmd, cp_flags);
 		/*
 		 * Automatic NUMA balancing walks the tables with mmap_lock
 		 * held for read. It's possible a parallel update to occur
@@ -320,6 +377,7 @@ static inline unsigned long change_pud_range(struct vm_area_struct *vma,
 	pud = pud_offset(p4d, addr);
 	do {
 		next = pud_addr_end(addr, end);
+		change_prepare(vma, pud, pmd, addr, cp_flags);
 		if (pud_none_or_clear_bad(pud))
 			continue;
 		pages += change_pmd_range(vma, pud, addr, next, newprot,
@@ -340,6 +398,7 @@ static inline unsigned long change_p4d_range(struct vm_area_struct *vma,
 	p4d = p4d_offset(pgd, addr);
 	do {
 		next = p4d_addr_end(addr, end);
+		change_prepare(vma, p4d, pud, addr, cp_flags);
 		if (p4d_none_or_clear_bad(p4d))
 			continue;
 		pages += change_pud_range(vma, p4d, addr, next, newprot,
@@ -365,6 +424,7 @@ static unsigned long change_protection_range(struct vm_area_struct *vma,
 	inc_tlb_flush_pending(mm);
 	do {
 		next = pgd_addr_end(addr, end);
+		change_prepare(vma, pgd, p4d, addr, cp_flags);
 		if (pgd_none_or_clear_bad(pgd))
 			continue;
 		pages += change_p4d_range(vma, pgd, addr, next, newprot,

From patchwork Tue Apr 5 01:48:52 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800964
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz, Nadav Amit, Matthew Wilcox, Mike Rapoport, David Hildenbrand,
    Hugh Dickins, Jerome Glisse, Kirill A. Shutemov, Andrea Arcangeli,
    Andrew Morton, Axel Rasmussen, Alistair Popple, peterx@redhat.com
Subject: [PATCH v8 09/23] mm/shmem: Allows file-back mem to be uffd wr-protected on thps
Date: Mon, 4 Apr 2022 21:48:52 -0400
Message-Id: <20220405014852.14413-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

We don't have a "huge" version of pte markers; instead, when necessary, we
split the thp.  However, splitting the thp is not enough, because a
file-backed thp is handled totally differently from anonymous thps: rather
than doing a real split, the thp pmd will simply get cleared in
__split_huge_pmd_locked().

That is not enough if, e.g., there is a thp covering the range [0, 2M) but we
want to wr-protect a small page residing in the [4K, 8K) range, because after
__split_huge_pmd() returns there will be a none pmd, and change_pmd_range()
will just skip it right after the split.

Here we leverage the previously introduced change_pmd_prepare() macro so that
we'll populate the pmd with a pgtable page after the pmd split (in which
process the pmd will be cleared for cases like shmem).
Then change_pte_range() will do all the rest for us by installing the uffd-wp
pte marker at any none pte that we'd like to wr-protect.

Signed-off-by: Peter Xu
---
 mm/mprotect.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index bd62d5938c6c..e0a567b66d07 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -333,8 +333,15 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		}
 
 		if (is_swap_pmd(*pmd) || pmd_trans_huge(*pmd) || pmd_devmap(*pmd)) {
-			if (next - addr != HPAGE_PMD_SIZE) {
+			if ((next - addr != HPAGE_PMD_SIZE) ||
+			    uffd_wp_protect_file(vma, cp_flags)) {
 				__split_huge_pmd(vma, pmd, addr, false, NULL);
+				/*
+				 * For file-backed, the pmd could have been
+				 * cleared; make sure pmd populated if
+				 * necessary, then fall-through to pte level.
+				 */
+				change_pmd_prepare(vma, pmd, cp_flags);
 			} else {
 				int nr_ptes = change_huge_pmd(vma, pmd, addr,
 							      newprot, cp_flags);

From patchwork Tue Apr 5 01:48:55 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800965
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz, Nadav Amit, Matthew Wilcox, Mike Rapoport, David Hildenbrand,
    Hugh Dickins, Jerome Glisse, Kirill A. Shutemov, Andrea Arcangeli,
    Andrew Morton, Axel Rasmussen, Alistair Popple, peterx@redhat.com
Subject: [PATCH v8 10/23] mm/shmem: Handle uffd-wp during fork()
Date: Mon, 4 Apr 2022 21:48:55 -0400
Message-Id: <20220405014855.14468-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

Normally we skip copying pages during fork() for VM_SHARED shmem, but we
can't skip it anymore if uffd-wp is enabled on the dst vma.  This should only
happen when the src uffd has UFFD_FEATURE_EVENT_FORK enabled on a uffd-wp
shmem vma, so that VM_UFFD_WP will be propagated onto the dst vma too; then we
should copy the pgtables with the uffd-wp bit and pte markers, because this
information will be lost otherwise.

Since the condition checks will become even more complicated for deciding
"whether a vma needs to copy the pgtable during fork()", introduce a helper
vma_needs_copy() for it, so everything will be clearer.
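[Editor's note] As a rough illustration of the decision helper this patch
introduces, the following userspace sketch models the same ordering of checks
with plain booleans.  struct vma_model, its fields, and the flag values are
hypothetical stand-ins for the kernel's vm_area_struct, anon_vma pointer, and
VM_* flags, not the real API.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative vma flag bits; the values are arbitrary, not the kernel's. */
#define VM_HUGETLB  0x1u
#define VM_PFNMAP   0x2u
#define VM_MIXEDMAP 0x4u

/* Hypothetical stand-in for struct vm_area_struct. */
struct vma_model {
	unsigned int vm_flags;
	bool has_anon_pages;	/* stands in for src_vma->anon_vma != NULL */
	bool uffd_wp_enabled;	/* stands in for userfaultfd_wp(dst_vma) */
};

/*
 * Mirrors the check ordering of the new vma_needs_copy() helper: uffd-wp on
 * the destination forces a pgtable copy even for shmem, because the wp bits
 * and pte markers exist only in the pgtable and cannot be recovered from
 * the page cache later.
 */
static bool vma_needs_copy(const struct vma_model *dst,
			   const struct vma_model *src)
{
	if (dst->uffd_wp_enabled)
		return true;
	if (src->vm_flags & (VM_HUGETLB | VM_PFNMAP | VM_MIXEDMAP))
		return true;
	if (src->has_anon_pages)
		return true;
	return false;	/* skip the copy; the child faults pages in lazily */
}
```

The interesting case is the first check: before this patch a VM_SHARED shmem
vma with no anon pages would fall through to the "skip" path even when the dst
vma had uffd-wp enabled.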
Signed-off-by: Peter Xu Reported-by: kernel test robot --- mm/memory.c | 49 +++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 41 insertions(+), 8 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 1144845ff734..8ba1bb196095 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -867,6 +867,14 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm, if (try_restore_exclusive_pte(src_pte, src_vma, addr)) return -EBUSY; return -ENOENT; + } else if (is_pte_marker_entry(entry)) { + /* + * We're copying the pgtable should only because dst_vma has + * uffd-wp enabled, do sanity check. + */ + WARN_ON_ONCE(!userfaultfd_wp(dst_vma)); + set_pte_at(dst_mm, addr, dst_pte, pte); + return 0; } if (!userfaultfd_wp(dst_vma)) pte = pte_swp_clear_uffd_wp(pte); @@ -1221,6 +1229,38 @@ copy_p4d_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma, return 0; } +/* + * Return true if the vma needs to copy the pgtable during this fork(). Return + * false when we can speed up fork() by allowing lazy page faults later until + * when the child accesses the memory range. + */ +bool +vma_needs_copy(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma) +{ + /* + * Always copy pgtables when dst_vma has uffd-wp enabled even if it's + * file-backed (e.g. shmem). Because when uffd-wp is enabled, pgtable + * contains uffd-wp protection information, that's something we can't + * retrieve from page cache, and skip copying will lose those info. + */ + if (userfaultfd_wp(dst_vma)) + return true; + + if (src_vma->vm_flags & (VM_HUGETLB | VM_PFNMAP | VM_MIXEDMAP)) + return true; + + if (src_vma->anon_vma) + return true; + + /* + * Don't copy ptes where a page fault will fill them correctly. Fork + * becomes much lighter when there are big shared or private readonly + * mappings. The tradeoff is that copy_page_range is more efficient + * than faulting. 
+ */ + return false; +} + int copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma) { @@ -1234,14 +1274,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma) bool is_cow; int ret; - /* - * Don't copy ptes where a page fault will fill them correctly. - * Fork becomes much lighter when there are big shared or private - * readonly mappings. The tradeoff is that copy_page_range is more - * efficient than faulting. - */ - if (!(src_vma->vm_flags & (VM_HUGETLB | VM_PFNMAP | VM_MIXEDMAP)) && - !src_vma->anon_vma) + if (!vma_needs_copy(dst_vma, src_vma)) return 0; if (is_vm_hugetlb_page(src_vma))

From patchwork Tue Apr 5 01:48:58 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800967
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz , Nadav Amit , Matthew Wilcox , Mike Rapoport , David Hildenbrand , Hugh Dickins , Jerome Glisse , "Kirill A . Shutemov" , Andrea Arcangeli , Andrew Morton , Axel Rasmussen , Alistair Popple , peterx@redhat.com
Subject: [PATCH v8 11/23] mm/hugetlb: Introduce huge pte version of uffd-wp helpers
Date: Mon, 4 Apr 2022 21:48:58 -0400
Message-Id: <20220405014858.14531-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
These will be used in the follow-up patches to check, set, or clear the uffd-wp bit of a huge pte. For now they reuse the small pte helpers. Archs can override these versions when necessary (with __HAVE_ARCH_HUGE_PTE_UFFD_WP* macros) in the future.

Signed-off-by: Peter Xu --- arch/s390/include/asm/hugetlb.h | 15 +++++++++++++++ include/asm-generic/hugetlb.h | 15 +++++++++++++++ 2 files changed, 30 insertions(+) diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h index bea47e7cc6a0..be99eda87f4d 100644 --- a/arch/s390/include/asm/hugetlb.h +++ b/arch/s390/include/asm/hugetlb.h @@ -115,6 +115,21 @@ static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot) return pte_modify(pte, newprot); } +static inline pte_t huge_pte_mkuffd_wp(pte_t pte) +{ + return pte; +} + +static inline pte_t huge_pte_clear_uffd_wp(pte_t pte) +{ + return pte; +} + +static inline int huge_pte_uffd_wp(pte_t pte) +{ + return 0; +} + static inline bool gigantic_page_runtime_supported(void) { return true; diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h index f39cad20ffc6..896f341f614d 100644 --- a/include/asm-generic/hugetlb.h +++ b/include/asm-generic/hugetlb.h @@ -35,6 +35,21 @@ static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot) return pte_modify(pte, newprot); } +static inline pte_t huge_pte_mkuffd_wp(pte_t pte) +{ + return pte_mkuffd_wp(pte); +} + +static inline pte_t huge_pte_clear_uffd_wp(pte_t pte) +{ + return pte_clear_uffd_wp(pte); +} + +static inline int huge_pte_uffd_wp(pte_t pte) +{ + return pte_uffd_wp(pte); +} + #ifndef __HAVE_ARCH_HUGE_PTE_CLEAR static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep, unsigned long sz)

From patchwork Tue Apr 5 01:49:01 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800968
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz , Nadav Amit , Matthew Wilcox , Mike Rapoport , David Hildenbrand , Hugh Dickins , Jerome Glisse , "Kirill A .
Shutemov" , Andrea Arcangeli , Andrew Morton , Axel Rasmussen , Alistair Popple , peterx@redhat.com Subject: [PATCH v8 12/23] mm/hugetlb: Hook page faults for uffd write protection Date: Mon, 4 Apr 2022 21:49:01 -0400 Message-Id: <20220405014901.14590-1-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220405014646.13522-1-peterx@redhat.com> References: <20220405014646.13522-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=ZCYCYzsk; spf=none (imf18.hostedemail.com: domain of peterx@redhat.com has no SPF policy when checking 170.10.133.124) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspam-User: X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 889C31C001F X-Stat-Signature: mjj77ny3oqxzi78mx79zywroc53euzjp X-HE-Tag: 1649123348-624313 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Hook up hugetlbfs_fault() with the capability to handle userfaultfd-wp faults. We do this slightly earlier than hugetlb_cow() so that we can avoid taking some extra locks that we definitely don't need. 
Reviewed-by: Mike Kravetz Signed-off-by: Peter Xu --- mm/hugetlb.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index dd642cfc538b..82df0fcfedf9 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5711,6 +5711,26 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, if (unlikely(!pte_same(entry, huge_ptep_get(ptep)))) goto out_ptl; + /* Handle userfault-wp first, before trying to lock more pages */ + if (userfaultfd_wp(vma) && huge_pte_uffd_wp(huge_ptep_get(ptep)) && + (flags & FAULT_FLAG_WRITE) && !huge_pte_write(entry)) { + struct vm_fault vmf = { + .vma = vma, + .address = haddr, + .real_address = address, + .flags = flags, + }; + + spin_unlock(ptl); + if (pagecache_page) { + unlock_page(pagecache_page); + put_page(pagecache_page); + } + mutex_unlock(&hugetlb_fault_mutex_table[hash]); + i_mmap_unlock_read(mapping); + return handle_userfault(&vmf, VM_UFFD_WP); + } + /* * hugetlb_wp() requires page locks of pte_page(entry) and * pagecache_page, so here we need take the former one

From patchwork Tue Apr 5 01:49:04 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800969
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz , Nadav Amit , Matthew Wilcox , Mike Rapoport , David Hildenbrand , Hugh Dickins , Jerome Glisse , "Kirill A .
Shutemov" , Andrea Arcangeli , Andrew Morton , Axel Rasmussen , Alistair Popple , peterx@redhat.com Subject: [PATCH v8 13/23] mm/hugetlb: Take care of UFFDIO_COPY_MODE_WP Date: Mon, 4 Apr 2022 21:49:04 -0400 Message-Id: <20220405014904.14643-1-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220405014646.13522-1-peterx@redhat.com> References: <20220405014646.13522-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Stat-Signature: wtq5fj7ywzjzpurbd3poo4pitjme3mx1 Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=Mv7AcGNk; spf=none (imf11.hostedemail.com: domain of peterx@redhat.com has no SPF policy when checking 170.10.133.124) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: B139D40018 X-HE-Tag: 1649123349-81044 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Pass the wp_copy variable into hugetlb_mcopy_atomic_pte() thoughout the stack. Apply the UFFD_WP bit if UFFDIO_COPY_MODE_WP is with UFFDIO_COPY. Hugetlb pages are only managed by hugetlbfs, so we're safe even without setting dirty bit in the huge pte if the page is installed as read-only. However we'd better still keep the dirty bit set for a read-only UFFDIO_COPY pte (when UFFDIO_COPY_MODE_WP bit is set), not only to match what we do with shmem, but also because the page does contain dirty data that the kernel just copied from the userspace. 
Signed-off-by: Peter Xu --- include/linux/hugetlb.h | 6 ++++-- mm/hugetlb.c | 29 +++++++++++++++++++++++------ mm/userfaultfd.c | 14 +++++++++----- 3 files changed, 36 insertions(+), 13 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 53c1b6082a4c..6347298778b6 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -160,7 +160,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, pte_t *dst_pte, unsigned long dst_addr, unsigned long src_addr, enum mcopy_atomic_mode mode, - struct page **pagep); + struct page **pagep, + bool wp_copy); #endif /* CONFIG_USERFAULTFD */ bool hugetlb_reserve_pages(struct inode *inode, long from, long to, struct vm_area_struct *vma, @@ -355,7 +356,8 @@ static inline int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, unsigned long dst_addr, unsigned long src_addr, enum mcopy_atomic_mode mode, - struct page **pagep) + struct page **pagep, + bool wp_copy) { BUG(); return 0; diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 82df0fcfedf9..c94deead22b2 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5795,7 +5795,8 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, unsigned long dst_addr, unsigned long src_addr, enum mcopy_atomic_mode mode, - struct page **pagep) + struct page **pagep, + bool wp_copy) { bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE); struct hstate *h = hstate_vma(dst_vma); @@ -5925,7 +5926,12 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, goto out_release_unlock; ret = -EEXIST; - if (!huge_pte_none(huge_ptep_get(dst_pte))) + /* + * We allow to overwrite a pte marker: consider when both MISSING|WP + * registered, we firstly wr-protect a none pte which has no page cache + * page backing it, then access the page. 
+ */ + if (!huge_pte_none_mostly(huge_ptep_get(dst_pte))) goto out_release_unlock; if (vm_shared) { @@ -5935,17 +5941,28 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, hugepage_add_new_anon_rmap(page, dst_vma, dst_addr); } - /* For CONTINUE on a non-shared VMA, don't set VM_WRITE for CoW. */ - if (is_continue && !vm_shared) + /* + * For either: (1) CONTINUE on a non-shared VMA, or (2) UFFDIO_COPY + * with wp flag set, don't set pte write bit. + */ + if (wp_copy || (is_continue && !vm_shared)) writable = 0; else writable = dst_vma->vm_flags & VM_WRITE; _dst_pte = make_huge_pte(dst_vma, page, writable); - if (writable) - _dst_pte = huge_pte_mkdirty(_dst_pte); + /* + * Always mark UFFDIO_COPY page dirty; note that this may not be + * extremely important for hugetlbfs for now since swapping is not + * supported, but we should still be clear in that this page cannot be + * thrown away at will, even if write bit not set. + */ + _dst_pte = huge_pte_mkdirty(_dst_pte); _dst_pte = pte_mkyoung(_dst_pte); + if (wp_copy) + _dst_pte = huge_pte_mkuffd_wp(_dst_pte); + set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte); (void)huge_ptep_set_access_flags(dst_vma, dst_addr, dst_pte, _dst_pte, diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index b1c875b77fbb..da0b3ed2a6b5 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -304,7 +304,8 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long src_start, unsigned long len, - enum mcopy_atomic_mode mode) + enum mcopy_atomic_mode mode, + bool wp_copy) { int vm_shared = dst_vma->vm_flags & VM_SHARED; ssize_t err; @@ -392,7 +393,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, } if (mode != MCOPY_ATOMIC_CONTINUE && - !huge_pte_none(huge_ptep_get(dst_pte))) { + !huge_pte_none_mostly(huge_ptep_get(dst_pte))) { err = -EEXIST; mutex_unlock(&hugetlb_fault_mutex_table[hash]); i_mmap_unlock_read(mapping); @@ -400,7 +401,8 @@ 
static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, } err = hugetlb_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma, - dst_addr, src_addr, mode, &page); + dst_addr, src_addr, mode, &page, + wp_copy); mutex_unlock(&hugetlb_fault_mutex_table[hash]); i_mmap_unlock_read(mapping); @@ -455,7 +457,8 @@ extern ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long src_start, unsigned long len, - enum mcopy_atomic_mode mode); + enum mcopy_atomic_mode mode, + bool wp_copy); #endif /* CONFIG_HUGETLB_PAGE */ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm, @@ -575,7 +578,8 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, */ if (is_vm_hugetlb_page(dst_vma)) return __mcopy_atomic_hugetlb(dst_mm, dst_vma, dst_start, - src_start, len, mcopy_mode); + src_start, len, mcopy_mode, + wp_copy); if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma)) goto out_unlock;

From patchwork Tue Apr 5 01:49:06 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800970
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz , Nadav Amit , Matthew Wilcox , Mike Rapoport , David Hildenbrand , Hugh Dickins , Jerome Glisse , "Kirill A .
Shutemov" , Andrea Arcangeli , Andrew Morton , Axel Rasmussen , Alistair Popple , peterx@redhat.com Subject: [PATCH v8 14/23] mm/hugetlb: Handle UFFDIO_WRITEPROTECT Date: Mon, 4 Apr 2022 21:49:06 -0400 Message-Id: <20220405014906.14708-1-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220405014646.13522-1-peterx@redhat.com> References: <20220405014646.13522-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=doDjhX1g; spf=none (imf14.hostedemail.com: domain of peterx@redhat.com has no SPF policy when checking 170.10.133.124) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Stat-Signature: o6b4no6u9k9fanbcqjrtcwym7xhkbaw3 X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 055FB100014 X-HE-Tag: 1649123351-754048 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: This starts from passing cp_flags into hugetlb_change_protection() so hugetlb will be able to handle MM_CP_UFFD_WP[_RESOLVE] requests. huge_pte_clear_uffd_wp() is introduced to handle the case where the UFFDIO_WRITEPROTECT is requested upon migrating huge page entries. 
Reviewed-by: Mike Kravetz Signed-off-by: Peter Xu --- include/linux/hugetlb.h | 6 ++++-- mm/hugetlb.c | 13 ++++++++++++- mm/mprotect.c | 3 ++- mm/userfaultfd.c | 8 ++++++++ 4 files changed, 26 insertions(+), 4 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 6347298778b6..38c5ac28b787 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -210,7 +210,8 @@ struct page *follow_huge_pgd(struct mm_struct *mm, unsigned long address, int pmd_huge(pmd_t pmd); int pud_huge(pud_t pud); unsigned long hugetlb_change_protection(struct vm_area_struct *vma, - unsigned long address, unsigned long end, pgprot_t newprot); + unsigned long address, unsigned long end, pgprot_t newprot, + unsigned long cp_flags); bool is_hugetlb_entry_migration(pte_t pte); void hugetlb_unshare_all_pmds(struct vm_area_struct *vma); @@ -391,7 +392,8 @@ static inline void move_hugetlb_state(struct page *oldpage, static inline unsigned long hugetlb_change_protection( struct vm_area_struct *vma, unsigned long address, - unsigned long end, pgprot_t newprot) + unsigned long end, pgprot_t newprot, + unsigned long cp_flags) { return 0; } diff --git a/mm/hugetlb.c b/mm/hugetlb.c index c94deead22b2..2401dd5997b7 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6207,7 +6207,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, } unsigned long hugetlb_change_protection(struct vm_area_struct *vma, - unsigned long address, unsigned long end, pgprot_t newprot) + unsigned long address, unsigned long end, + pgprot_t newprot, unsigned long cp_flags) { struct mm_struct *mm = vma->vm_mm; unsigned long start = address; @@ -6217,6 +6218,8 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, unsigned long pages = 0; bool shared_pmd = false; struct mmu_notifier_range range; + bool uffd_wp = cp_flags & MM_CP_UFFD_WP; + bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE; /* * In the case of shared PMDs, the area to flush could be 
beyond
@@ -6263,6 +6266,10 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
                                entry = make_readable_migration_entry(
                                                        swp_offset(entry));
                                newpte = swp_entry_to_pte(entry);
+                               if (uffd_wp)
+                                       newpte = pte_swp_mkuffd_wp(newpte);
+                               else if (uffd_wp_resolve)
+                                       newpte = pte_swp_clear_uffd_wp(newpte);
                                set_huge_swap_pte_at(mm, address, ptep,
                                                     newpte, huge_page_size(h));
                                pages++;
@@ -6277,6 +6284,10 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
                        old_pte = huge_ptep_modify_prot_start(vma, address, ptep);
                        pte = huge_pte_modify(old_pte, newprot);
                        pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
+                       if (uffd_wp)
+                               pte = huge_pte_mkuffd_wp(huge_pte_wrprotect(pte));
+                       else if (uffd_wp_resolve)
+                               pte = huge_pte_clear_uffd_wp(pte);
                        huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte);
                        pages++;
                }

diff --git a/mm/mprotect.c b/mm/mprotect.c
index e0a567b66d07..6b0e8c213508 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -455,7 +455,8 @@ unsigned long change_protection(struct vm_area_struct *vma, unsigned long start,
        BUG_ON((cp_flags & MM_CP_UFFD_WP_ALL) == MM_CP_UFFD_WP_ALL);
 
        if (is_vm_hugetlb_page(vma))
-               pages = hugetlb_change_protection(vma, start, end, newprot);
+               pages = hugetlb_change_protection(vma, start, end, newprot,
+                                                 cp_flags);
        else
                pages = change_protection_range(vma, start, end, newprot,
                                                cp_flags);

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index da0b3ed2a6b5..58d67f2bf980 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -704,6 +704,7 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
                        atomic_t *mmap_changing)
 {
        struct vm_area_struct *dst_vma;
+       unsigned long page_mask;
        pgprot_t newprot;
        int err;
@@ -740,6 +741,13 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
        if (!vma_is_anonymous(dst_vma))
                goto out_unlock;
 
+       if (is_vm_hugetlb_page(dst_vma)) {
+               err = -EINVAL;
+               page_mask = vma_kernel_pagesize(dst_vma) - 1;
+               if ((start & page_mask) || (len & page_mask))
+                       goto out_unlock;
+       }
+
        if (enable_wp)
                newprot = vm_get_page_prot(dst_vma->vm_flags & ~(VM_WRITE));
        else

From patchwork Tue Apr 5 01:49:09 2022
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz , Nadav Amit , Matthew Wilcox , Mike Rapoport , David Hildenbrand , Hugh Dickins , Jerome Glisse , "Kirill A . Shutemov" , Andrea Arcangeli , Andrew Morton , Axel Rasmussen , Alistair Popple , peterx@redhat.com
Subject: [PATCH v8 15/23] mm/hugetlb: Handle pte markers in page faults
Date: Mon, 4 Apr 2022 21:49:09 -0400
Message-Id: <20220405014909.14761-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

Allow hugetlb code to handle pte markers just like none ptes. Most of the support is already there; we just need to make sure we don't assume hugetlb_no_page() only handles none ptes, so when detecting a pte change we should use pte_same() rather than pte_none(). This requires passing the old_pte in to do the comparison.
Check the original pte to see whether it's a pte marker; if it is, recover the uffd-wp bit on the new pte to be installed, so that the next write will be trapped by uffd.

Signed-off-by: Peter Xu
Reported-by: kernel test robot
---
 mm/hugetlb.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2401dd5997b7..9317b790161d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5412,7 +5412,8 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_area_struct *vma,
 static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
                        struct vm_area_struct *vma,
                        struct address_space *mapping, pgoff_t idx,
-                       unsigned long address, pte_t *ptep, unsigned int flags)
+                       unsigned long address, pte_t *ptep,
+                       pte_t old_pte, unsigned int flags)
 {
        struct hstate *h = hstate_vma(vma);
        vm_fault_t ret = VM_FAULT_SIGBUS;
@@ -5539,7 +5540,8 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
        ptl = huge_pte_lock(h, mm, ptep);
        ret = 0;
-       if (!huge_pte_none(huge_ptep_get(ptep)))
+       /* If pte changed from under us, retry */
+       if (!pte_same(huge_ptep_get(ptep), old_pte))
                goto backout;
 
        if (anon_rmap) {
@@ -5549,6 +5551,12 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
                page_dup_file_rmap(page, true);
        new_pte = make_huge_pte(vma, page, ((vma->vm_flags & VM_WRITE)
                                && (vma->vm_flags & VM_SHARED)));
+       /*
+        * If this pte was previously wr-protected, keep it wr-protected even
+        * if populated.
+        */
+       if (unlikely(pte_marker_uffd_wp(old_pte)))
+               new_pte = huge_pte_wrprotect(huge_pte_mkuffd_wp(new_pte));
        set_huge_pte_at(mm, haddr, ptep, new_pte);
 
        hugetlb_count_add(pages_per_huge_page(h), mm);
@@ -5666,8 +5674,10 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
        mutex_lock(&hugetlb_fault_mutex_table[hash]);
 
        entry = huge_ptep_get(ptep);
-       if (huge_pte_none(entry)) {
-               ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep, flags);
+       /* PTE markers should be handled the same way as none pte */
+       if (huge_pte_none_mostly(entry)) {
+               ret = hugetlb_no_page(mm, vma, mapping, idx, address, ptep,
+                                     entry, flags);
                goto out_mutex;
        }

From patchwork Tue Apr 5 01:49:12 2022
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz , Nadav Amit , Matthew Wilcox , Mike Rapoport , David Hildenbrand , Hugh Dickins , Jerome Glisse , "Kirill A . Shutemov" , Andrea Arcangeli , Andrew Morton , Axel Rasmussen , Alistair Popple , peterx@redhat.com
Subject: [PATCH v8 16/23] mm/hugetlb: Allow uffd wr-protect none ptes
Date: Mon, 4 Apr 2022 21:49:12 -0400
Message-Id: <20220405014912.14815-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>
Teach hugetlbfs code to wr-protect none ptes in case the page cache exists for that pte. Meanwhile we also need to be able to recognize a uffd-wp marker pte and remove it for uffd_wp_resolve. While at it, introduce a variable "psize" to replace all references to the huge page size fetcher.

Reviewed-by: Mike Kravetz
Signed-off-by: Peter Xu
---
 mm/hugetlb.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9317b790161d..578c48ef931a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6225,7 +6225,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
        pte_t *ptep;
        pte_t pte;
        struct hstate *h = hstate_vma(vma);
-       unsigned long pages = 0;
+       unsigned long pages = 0, psize = huge_page_size(h);
        bool shared_pmd = false;
        struct mmu_notifier_range range;
        bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
@@ -6245,13 +6245,19 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
        mmu_notifier_invalidate_range_start(&range);
        i_mmap_lock_write(vma->vm_file->f_mapping);
-       for (; address < end; address += huge_page_size(h)) {
+       for (; address < end; address += psize) {
                spinlock_t *ptl;
-               ptep = huge_pte_offset(mm, address, huge_page_size(h));
+               ptep = huge_pte_offset(mm, address, psize);
                if (!ptep)
                        continue;
                ptl = huge_pte_lock(h, mm, ptep);
                if (huge_pmd_unshare(mm, vma, &address, ptep)) {
+                       /*
+                        * When uffd-wp is enabled on the vma, unshare
+                        * shouldn't happen at all.  Warn about it if it
+                        * happened due to some reason.
+                        */
+                       WARN_ON_ONCE(uffd_wp || uffd_wp_resolve);
                        pages++;
                        spin_unlock(ptl);
                        shared_pmd = true;
@@ -6281,12 +6287,20 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
                                else if (uffd_wp_resolve)
                                        newpte = pte_swp_clear_uffd_wp(newpte);
                                set_huge_swap_pte_at(mm, address, ptep,
-                                                    newpte, huge_page_size(h));
+                                                    newpte, psize);
                                pages++;
                        }
                        spin_unlock(ptl);
                        continue;
                }
+               if (unlikely(pte_marker_uffd_wp(pte))) {
+                       /*
+                        * This is changing a non-present pte into a none pte,
+                        * no need for huge_ptep_modify_prot_start/commit().
+                        */
+                       if (uffd_wp_resolve)
+                               huge_pte_clear(mm, address, ptep, psize);
+               }
                if (!huge_pte_none(pte)) {
                        pte_t old_pte;
                        unsigned int shift = huge_page_shift(hstate_vma(vma));
@@ -6300,6 +6314,12 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
                                pte = huge_pte_clear_uffd_wp(pte);
                        huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte);
                        pages++;
+               } else {
+                       /* None pte */
+                       if (unlikely(uffd_wp))
+                               /* Safe to modify directly (none->non-present).
 */
+                               set_huge_pte_at(mm, address, ptep,
+                                               make_pte_marker(PTE_MARKER_UFFD_WP));
                }
                spin_unlock(ptl);
        }

From patchwork Tue Apr 5 01:49:15 2022
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz , Nadav Amit , Matthew Wilcox , Mike Rapoport , David Hildenbrand , Hugh Dickins , Jerome Glisse , "Kirill A . Shutemov" , Andrea Arcangeli , Andrew Morton , Axel Rasmussen , Alistair Popple , peterx@redhat.com
Subject: [PATCH v8 17/23] mm/hugetlb: Only drop uffd-wp special pte if required
Date: Mon, 4 Apr 2022 21:49:15 -0400
Message-Id: <20220405014915.14873-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

As with shmem uffd-wp special ptes, only drop the uffd-wp special swap pte if unmapping an entire vma, or if synchronized such that faults cannot race with the unmap operation. This requires passing zap_flags all the way down to the lowest-level hugetlb unmap routine: __unmap_hugepage_range. In general, unmap calls originating in hugetlbfs code will pass the ZAP_FLAG_DROP_MARKER flag, as synchronization is in place to prevent faults.
The exception is hole punch, which will first unmap without any synchronization. Later, when hole punch actually removes the page from the file, it will check to see if there was a subsequent fault and, if so, take the hugetlb fault mutex while unmapping again. This second unmap will pass in ZAP_FLAG_DROP_MARKER.

The justification for "whether to apply the ZAP_FLAG_DROP_MARKER flag when unmapping a hugetlb range" is (IMHO): we should never reach a state where a page fault could erroneously fault in a writable page-cache page that was wr-protected, even for an extremely short period. That could happen if e.g. we passed ZAP_FLAG_DROP_MARKER when hugetlbfs_punch_hole() calls hugetlb_vmdelete_list(): if a page faults after that call and before remove_inode_hugepages() is executed, the page cache can be mapped writable again in that small racy window, which can cause unexpected data to be overwritten.

Reviewed-by: Mike Kravetz
Signed-off-by: Peter Xu
---
 fs/hugetlbfs/inode.c    | 15 +++++++++------
 include/linux/hugetlb.h |  8 +++++---
 mm/hugetlb.c            | 33 +++++++++++++++++++++++++--------
 mm/memory.c             |  5 ++++-
 4 files changed, 43 insertions(+), 18 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 99c7477cee5c..8b5b9df2be7d 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -404,7 +404,8 @@ static void remove_huge_page(struct page *page)
 }
 
 static void
-hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end)
+hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
+                     unsigned long zap_flags)
 {
        struct vm_area_struct *vma;
@@ -438,7 +439,7 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end)
        }
 
        unmap_hugepage_range(vma, vma->vm_start + v_offset, v_end,
-                            NULL);
+                            NULL, zap_flags);
        }
 }
@@ -516,7 +517,8 @@ static void remove_inode_hugepages(struct inode *inode, loff_t lstart,
                        mutex_lock(&hugetlb_fault_mutex_table[hash]);
                        hugetlb_vmdelete_list(&mapping->i_mmap, index *
                                        pages_per_huge_page(h),
-                               (index + 1) * pages_per_huge_page(h));
+                               (index + 1) * pages_per_huge_page(h),
+                               ZAP_FLAG_DROP_MARKER);
                        i_mmap_unlock_write(mapping);
                }
@@ -582,7 +584,8 @@ static void hugetlb_vmtruncate(struct inode *inode, loff_t offset)
        i_mmap_lock_write(mapping);
        i_size_write(inode, offset);
        if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))
-               hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0);
+               hugetlb_vmdelete_list(&mapping->i_mmap, pgoff, 0,
+                                     ZAP_FLAG_DROP_MARKER);
        i_mmap_unlock_write(mapping);
        remove_inode_hugepages(inode, offset, LLONG_MAX);
 }
@@ -615,8 +618,8 @@ static long hugetlbfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
                i_mmap_lock_write(mapping);
                if (!RB_EMPTY_ROOT(&mapping->i_mmap.rb_root))
                        hugetlb_vmdelete_list(&mapping->i_mmap,
-                                             hole_start >> PAGE_SHIFT,
-                                             hole_end >> PAGE_SHIFT);
+                                             hole_start >> PAGE_SHIFT,
+                                             hole_end >> PAGE_SHIFT, 0);
                i_mmap_unlock_write(mapping);
                remove_inode_hugepages(inode, hole_start, hole_end);
                inode_unlock(inode);

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 38c5ac28b787..ab48b3bbb0e6 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -143,11 +143,12 @@ long follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *,
                         unsigned long *, unsigned long *, long, unsigned int,
                         int *);
 void unmap_hugepage_range(struct vm_area_struct *,
-                         unsigned long, unsigned long, struct page *);
+                         unsigned long, unsigned long, struct page *,
+                         unsigned long);
 void __unmap_hugepage_range_final(struct mmu_gather *tlb,
                          struct vm_area_struct *vma,
                          unsigned long start, unsigned long end,
-                         struct page *ref_page);
+                         struct page *ref_page, unsigned long zap_flags);
 void hugetlb_report_meminfo(struct seq_file *);
 int hugetlb_report_node_meminfo(char *buf, int len, int nid);
 void hugetlb_show_meminfo(void);
@@ -400,7 +401,8 @@ static inline unsigned long hugetlb_change_protection(
 
 static inline void __unmap_hugepage_range_final(struct mmu_gather *tlb,
                        struct vm_area_struct *vma,
                        unsigned long start,
-                       unsigned long end, struct page *ref_page)
+                       unsigned long end, struct page *ref_page,
+                       unsigned long zap_flags)
 {
        BUG();
 }

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 578c48ef931a..e4af8b357b90 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4947,7 +4947,7 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 
 static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
                            unsigned long start, unsigned long end,
-                           struct page *ref_page)
+                           struct page *ref_page, unsigned long zap_flags)
 {
        struct mm_struct *mm = vma->vm_mm;
        unsigned long address;
@@ -5003,7 +5003,18 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
                 * unmapped and its refcount is dropped, so just clear pte here.
                 */
                if (unlikely(!pte_present(pte))) {
-                       huge_pte_clear(mm, address, ptep, sz);
+                       /*
+                        * If the pte was wr-protected by uffd-wp in any of the
+                        * swap forms, meanwhile the caller does not want to
+                        * drop the uffd-wp bit in this zap, then replace the
+                        * pte with a marker.
+                        */
+                       if (pte_swp_uffd_wp_any(pte) &&
+                           !(zap_flags & ZAP_FLAG_DROP_MARKER))
+                               set_huge_pte_at(mm, address, ptep,
+                                               make_pte_marker(PTE_MARKER_UFFD_WP));
+                       else
+                               huge_pte_clear(mm, address, ptep, sz);
                        spin_unlock(ptl);
                        continue;
                }
@@ -5031,7 +5042,11 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
                tlb_remove_huge_tlb_entry(h, tlb, ptep, address);
                if (huge_pte_dirty(pte))
                        set_page_dirty(page);
-
+               /* Leave a uffd-wp pte marker if needed */
+               if (huge_pte_uffd_wp(pte) &&
+                   !(zap_flags & ZAP_FLAG_DROP_MARKER))
+                       set_huge_pte_at(mm, address, ptep,
+                                       make_pte_marker(PTE_MARKER_UFFD_WP));
                hugetlb_count_sub(pages_per_huge_page(h), mm);
                page_remove_rmap(page, vma, true);
 
@@ -5065,9 +5080,10 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
 
 void __unmap_hugepage_range_final(struct mmu_gather *tlb,
                          struct vm_area_struct *vma, unsigned long start,
-                         unsigned long end, struct page *ref_page)
+                         unsigned long end, struct page *ref_page,
+                         unsigned long zap_flags)
 {
-       __unmap_hugepage_range(tlb, vma, start, end, ref_page);
+       __unmap_hugepage_range(tlb, vma, start, end, ref_page, zap_flags);
 
        /*
         * Clear this flag so that x86's huge_pmd_share page_table_shareable
@@ -5083,12 +5099,13 @@ void __unmap_hugepage_range_final(struct mmu_gather *tlb,
 }
 
 void unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
-                         unsigned long end, struct page *ref_page)
+                         unsigned long end, struct page *ref_page,
+                         unsigned long zap_flags)
 {
        struct mmu_gather tlb;
 
        tlb_gather_mmu(&tlb, vma->vm_mm);
-       __unmap_hugepage_range(&tlb, vma, start, end, ref_page);
+       __unmap_hugepage_range(&tlb, vma, start, end, ref_page, zap_flags);
        tlb_finish_mmu(&tlb);
 }
@@ -5143,7 +5160,7 @@ static void unmap_ref_private(struct mm_struct *mm, struct vm_area_struct *vma,
                 */
                if (!is_vma_resv_set(iter_vma, HPAGE_RESV_OWNER))
                        unmap_hugepage_range(iter_vma, address,
-                                            address + huge_page_size(h), page);
+                                            address + huge_page_size(h), page, 0);
        }
        i_mmap_unlock_write(mapping);
 }

diff --git a/mm/memory.c b/mm/memory.c
index 8ba1bb196095..9808edfe18d4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1675,8 +1675,11 @@ static void unmap_single_vma(struct mmu_gather *tlb,
                 * safe to do nothing in this case.
                 */
                if (vma->vm_file) {
+                       unsigned long zap_flags = details ?
+                               details->zap_flags : 0;
                        i_mmap_lock_write(vma->vm_file->f_mapping);
-                       __unmap_hugepage_range_final(tlb, vma, start, end, NULL);
+                       __unmap_hugepage_range_final(tlb, vma, start, end,
+                                                    NULL, zap_flags);
                        i_mmap_unlock_write(vma->vm_file->f_mapping);
                }
        } else

From patchwork Tue Apr 5 01:49:18 2022
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz, Nadav Amit, Matthew Wilcox, Mike Rapoport,
    David Hildenbrand, Hugh Dickins, Jerome Glisse, Kirill A. Shutemov,
    Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Alistair Popple,
    peterx@redhat.com
Subject: [PATCH v8 18/23] mm/hugetlb: Handle uffd-wp during fork()
Date: Mon, 4 Apr 2022 21:49:18 -0400
Message-Id: <20220405014918.14932-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

Firstly, we need to pass dst_vma into copy_hugetlb_page_range(), because for uffd-wp it's
the dst vma that matters when deciding how to treat uffd-wp protected
ptes.  We should also recognize pte markers during fork and copy them
when needed.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/hugetlb.h |  7 +++++--
 mm/hugetlb.c            | 42 +++++++++++++++++++++++++++--------------
 mm/memory.c             |  2 +-
 3 files changed, 34 insertions(+), 17 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ab48b3bbb0e6..6df51d23b7ee 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -137,7 +137,8 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma,
 			     struct vm_area_struct *new_vma,
 			     unsigned long old_addr, unsigned long new_addr,
 			     unsigned long len);
-int copy_hugetlb_page_range(struct mm_struct *, struct mm_struct *, struct vm_area_struct *);
+int copy_hugetlb_page_range(struct mm_struct *, struct mm_struct *,
+			    struct vm_area_struct *, struct vm_area_struct *);
 long follow_hugetlb_page(struct mm_struct *, struct vm_area_struct *,
 			 struct page **, struct vm_area_struct **,
 			 unsigned long *, unsigned long *, long, unsigned int,
@@ -268,7 +269,9 @@ static inline struct page *follow_huge_addr(struct mm_struct *mm,
 }
 
 static inline int copy_hugetlb_page_range(struct mm_struct *dst,
-					  struct mm_struct *src, struct vm_area_struct *vma)
+					  struct mm_struct *src,
+					  struct vm_area_struct *dst_vma,
+					  struct vm_area_struct *src_vma)
 {
 	BUG();
 	return 0;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e4af8b357b90..e1571179698a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4706,23 +4706,24 @@ hugetlb_install_page(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr
 }
 
 int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
-			    struct vm_area_struct *vma)
+			    struct vm_area_struct *dst_vma,
+			    struct vm_area_struct *src_vma)
 {
 	pte_t *src_pte, *dst_pte, entry, dst_entry;
 	struct page *ptepage;
 	unsigned long addr;
-	bool cow = is_cow_mapping(vma->vm_flags);
-	struct hstate *h = hstate_vma(vma);
+	bool cow = is_cow_mapping(src_vma->vm_flags);
+	struct hstate *h = hstate_vma(src_vma);
 	unsigned long sz = huge_page_size(h);
 	unsigned long npages = pages_per_huge_page(h);
-	struct address_space *mapping = vma->vm_file->f_mapping;
+	struct address_space *mapping = src_vma->vm_file->f_mapping;
 	struct mmu_notifier_range range;
 	int ret = 0;
 
 	if (cow) {
-		mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, src,
-					vma->vm_start,
-					vma->vm_end);
+		mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, src_vma, src,
+					src_vma->vm_start,
+					src_vma->vm_end);
 		mmu_notifier_invalidate_range_start(&range);
 		mmap_assert_write_locked(src);
 		raw_write_seqcount_begin(&src->write_protect_seq);
@@ -4736,12 +4737,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 		i_mmap_lock_read(mapping);
 	}
 
-	for (addr = vma->vm_start; addr < vma->vm_end; addr += sz) {
+	for (addr = src_vma->vm_start; addr < src_vma->vm_end; addr += sz) {
 		spinlock_t *src_ptl, *dst_ptl;
 		src_pte = huge_pte_offset(src, addr, sz);
 		if (!src_pte)
 			continue;
-		dst_pte = huge_pte_alloc(dst, vma, addr, sz);
+		dst_pte = huge_pte_alloc(dst, dst_vma, addr, sz);
 		if (!dst_pte) {
 			ret = -ENOMEM;
 			break;
@@ -4776,6 +4777,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 		} else if (unlikely(is_hugetlb_entry_migration(entry) ||
 				    is_hugetlb_entry_hwpoisoned(entry))) {
 			swp_entry_t swp_entry = pte_to_swp_entry(entry);
+			bool uffd_wp = huge_pte_uffd_wp(entry);
 
 			if (!is_readable_migration_entry(swp_entry) && cow) {
 				/*
@@ -4785,10 +4787,21 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				swp_entry = make_readable_migration_entry(
 							swp_offset(swp_entry));
 				entry = swp_entry_to_pte(swp_entry);
+				if (userfaultfd_wp(src_vma) && uffd_wp)
+					entry = huge_pte_mkuffd_wp(entry);
 				set_huge_swap_pte_at(src, addr, src_pte,
 						     entry, sz);
 			}
+			if (!userfaultfd_wp(dst_vma) && uffd_wp)
+				entry = huge_pte_clear_uffd_wp(entry);
 			set_huge_swap_pte_at(dst, addr, dst_pte, entry, sz);
+		} else if (unlikely(is_pte_marker(entry))) {
+			/*
+			 * We copy the pte marker only if the dst vma has
+			 * uffd-wp enabled.
+			 */
+			if (userfaultfd_wp(dst_vma))
+				set_huge_pte_at(dst, addr, dst_pte, entry);
 		} else {
 			entry = huge_ptep_get(src_pte);
 			ptepage = pte_page(entry);
@@ -4806,20 +4819,21 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 			 */
 			if (!PageAnon(ptepage)) {
 				page_dup_file_rmap(ptepage, true);
-			} else if (page_try_dup_anon_rmap(ptepage, true, vma)) {
+			} else if (page_try_dup_anon_rmap(ptepage, true,
+							  src_vma)) {
 				pte_t src_pte_old = entry;
 				struct page *new;
 
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
 				/* Do not use reserve as it's private owned */
-				new = alloc_huge_page(vma, addr, 1);
+				new = alloc_huge_page(dst_vma, addr, 1);
 				if (IS_ERR(new)) {
 					put_page(ptepage);
 					ret = PTR_ERR(new);
 					break;
 				}
-				copy_user_huge_page(new, ptepage, addr, vma,
+				copy_user_huge_page(new, ptepage, addr, dst_vma,
 						    npages);
 				put_page(ptepage);
 
@@ -4829,13 +4843,13 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
 				entry = huge_ptep_get(src_pte);
 				if (!pte_same(src_pte_old, entry)) {
-					restore_reserve_on_error(h, vma, addr,
+					restore_reserve_on_error(h, dst_vma, addr,
 								 new);
 					put_page(new);
 					/* dst_entry won't change as in child */
 					goto again;
 				}
-				hugetlb_install_page(vma, dst_pte, addr, new);
+				hugetlb_install_page(dst_vma, dst_pte, addr, new);
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
 				continue;
diff --git a/mm/memory.c b/mm/memory.c
index 9808edfe18d4..d1e9c2517dfb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1278,7 +1278,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 		return 0;
 
 	if (is_vm_hugetlb_page(src_vma))
-		return copy_hugetlb_page_range(dst_mm, src_mm, src_vma);
+		return copy_hugetlb_page_range(dst_mm, src_mm, dst_vma, src_vma);
 
 	if (unlikely(src_vma->vm_flags & VM_PFNMAP)) {
 		/*

From patchwork Tue Apr 5 01:49:21 2022
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz, Nadav Amit, Matthew Wilcox, Mike Rapoport,
    David Hildenbrand, Hugh Dickins, Jerome Glisse, Kirill A. Shutemov,
    Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Alistair Popple,
    peterx@redhat.com
Subject: [PATCH v8 19/23] mm/khugepaged: Don't recycle vma pgtable if uffd-wp registered
Date: Mon, 4 Apr 2022 21:49:21 -0400
Message-Id: <20220405014921.14994-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

When we're trying to collapse a 2M huge shmem page, don't retract the
pgtable pmd page if it's registered with uffd-wp, because that pgtable
could have pte markers installed.  Recycling that pgtable means we'd
lose the pte markers, which could cause data loss for an uffd-wp enabled
application on shmem.
Instead of disabling khugepaged on these files, simply skip retracting
the pgtable for these special VMAs; the page cache can still be merged
into a huge thp, and other mms/vmas can still map the range of the file
with a huge thp when appropriate.

Note that checking VM_UFFD_WP needs to be done with mmap_sem held for
write, which avoids races like:

             khugepaged                    user thread
             ==========                    ===========
     check VM_UFFD_WP, not set
                                  UFFDIO_REGISTER with uffd-wp on shmem
                                  wr-protect some pages (install markers)
     take mmap_sem write lock
     erase pmd and free pmd page
      --> pte markers are dropped unnoticed!

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/khugepaged.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 04a972259136..d7c5bb9fd1fb 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1464,6 +1464,10 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
 	if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE))
 		return;
 
+	/* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
+	if (userfaultfd_wp(vma))
+		return;
+
 	hpage = find_lock_page(vma->vm_file->f_mapping,
 			       linear_page_index(vma, haddr));
 	if (!hpage)
@@ -1599,7 +1603,15 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 		 * reverse order. Trylock is a way to avoid deadlock.
 		 */
 		if (mmap_write_trylock(mm)) {
-			if (!khugepaged_test_exit(mm))
+			/*
+			 * When a vma is registered with uffd-wp, we can't
+			 * recycle the pmd pgtable because there can be pte
+			 * markers installed.  Skip it only, so the rest mm/vma
+			 * can still have the same file mapped hugely, however
+			 * it'll always be mapped in small page size for uffd-wp
+			 * registered ranges.
+			 */
+			if (!khugepaged_test_exit(mm) && !userfaultfd_wp(vma))
 				collapse_and_free_pmd(mm, vma, addr, pmd);
 			mmap_write_unlock(mm);
 		} else {

From patchwork Tue Apr 5 01:49:23 2022
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz, Nadav Amit, Matthew Wilcox, Mike Rapoport,
    David Hildenbrand, Hugh Dickins, Jerome Glisse, Kirill A. Shutemov,
    Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Alistair Popple,
    peterx@redhat.com
Subject: [PATCH v8 20/23] mm/pagemap: Recognize uffd-wp bit for shmem/hugetlbfs
Date: Mon, 4 Apr 2022 21:49:23 -0400
Message-Id: <20220405014923.15047-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

This requires the pagemap code to recognize the newly introduced swap
special pte for uffd-wp, as well as the general hugetlb case that we
recently started to support.  It should make pagemap uffd-wp support
complete.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 fs/proc/task_mmu.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index f46060eb91b5..194dfd7abf2b 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1421,6 +1421,8 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
 		migration = is_migration_entry(entry);
 		if (is_pfn_swap_entry(entry))
 			page = pfn_swap_entry_to_page(entry);
+		if (pte_marker_entry_uffd_wp(entry))
+			flags |= PM_UFFD_WP;
 	}
 
 	if (page && !PageAnon(page))
@@ -1556,10 +1558,15 @@ static int pagemap_hugetlb_range(pte_t *ptep, unsigned long hmask,
 		if (page_mapcount(page) == 1)
 			flags |= PM_MMAP_EXCLUSIVE;
 
+		if (huge_pte_uffd_wp(pte))
+			flags |= PM_UFFD_WP;
+
 		flags |= PM_PRESENT;
 		if (pm->show_pfn)
 			frame = pte_pfn(pte) +
 				((addr & ~hmask) >> PAGE_SHIFT);
+	} else if (pte_swp_uffd_wp_any(pte)) {
+		flags |= PM_UFFD_WP;
 	}
 
 	for (; addr != end; addr += PAGE_SIZE) {

From patchwork Tue Apr 5 01:49:26 2022
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz, Nadav Amit, Matthew Wilcox, Mike Rapoport,
    David Hildenbrand, Hugh Dickins, Jerome Glisse, Kirill A. Shutemov,
    Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Alistair Popple,
    peterx@redhat.com
Subject: [PATCH v8 21/23] mm/uffd: Enable write protection for shmem & hugetlbfs
Date: Mon, 4 Apr 2022 21:49:26 -0400
Message-Id: <20220405014926.15101-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

We've had all the necessary changes ready for both shmem and hugetlbfs.
Turn on all the shmem/hugetlbfs switches for userfaultfd-wp.

We can expand UFFD_API_RANGE_IOCTLS_BASIC with _UFFDIO_WRITEPROTECT too,
because all existing types now support write protection mode.

Since vma_can_userfault() will be used elsewhere, move it into
userfaultfd_k.h.
Signed-off-by: Peter Xu --- fs/userfaultfd.c | 21 +++------------------ include/linux/userfaultfd_k.h | 20 ++++++++++++++++++++ include/uapi/linux/userfaultfd.h | 10 ++++++++-- mm/userfaultfd.c | 9 +++------ 4 files changed, 34 insertions(+), 26 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 8b4a94f5a238..fb45522a2b44 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1257,24 +1257,6 @@ static __always_inline int validate_range(struct mm_struct *mm, return 0; } -static inline bool vma_can_userfault(struct vm_area_struct *vma, - unsigned long vm_flags) -{ - /* FIXME: add WP support to hugetlbfs and shmem */ - if (vm_flags & VM_UFFD_WP) { - if (is_vm_hugetlb_page(vma) || vma_is_shmem(vma)) - return false; - } - - if (vm_flags & VM_UFFD_MINOR) { - if (!(is_vm_hugetlb_page(vma) || vma_is_shmem(vma))) - return false; - } - - return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) || - vma_is_shmem(vma); -} - static int userfaultfd_register(struct userfaultfd_ctx *ctx, unsigned long arg) { @@ -1955,6 +1937,9 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx, #endif #ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP; +#endif +#ifndef CONFIG_PTE_MARKER_UFFD_WP + uffdio_api.features &= ~UFFD_FEATURE_WP_HUGETLBFS_SHMEM; #endif uffdio_api.ioctls = UFFD_API_IOCTLS; ret = -EFAULT; diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 827e38b7be65..ea11bed9bb7e 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -18,6 +18,7 @@ #include #include #include +#include /* The set of all possible UFFD-related VM flags. 
*/ #define __VM_UFFD_FLAGS (VM_UFFD_MISSING | VM_UFFD_WP | VM_UFFD_MINOR) @@ -140,6 +141,25 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma) return vma->vm_flags & __VM_UFFD_FLAGS; } +static inline bool vma_can_userfault(struct vm_area_struct *vma, + unsigned long vm_flags) +{ + if (vm_flags & VM_UFFD_MINOR) + return is_vm_hugetlb_page(vma) || vma_is_shmem(vma); + +#ifndef CONFIG_PTE_MARKER_UFFD_WP + /* + * If user requested uffd-wp but not enabled pte markers for + * uffd-wp, then shmem & hugetlbfs are not supported but only + * anonymous. + */ + if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma)) + return false; +#endif + return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) || + vma_is_shmem(vma); +} + extern int dup_userfaultfd(struct vm_area_struct *, struct list_head *); extern void dup_userfaultfd_complete(struct list_head *); diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index ef739054cb1c..7d32b1e797fb 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -33,7 +33,8 @@ UFFD_FEATURE_THREAD_ID | \ UFFD_FEATURE_MINOR_HUGETLBFS | \ UFFD_FEATURE_MINOR_SHMEM | \ - UFFD_FEATURE_EXACT_ADDRESS) + UFFD_FEATURE_EXACT_ADDRESS | \ + UFFD_FEATURE_WP_HUGETLBFS_SHMEM) #define UFFD_API_IOCTLS \ ((__u64)1 << _UFFDIO_REGISTER | \ (__u64)1 << _UFFDIO_UNREGISTER | \ @@ -47,7 +48,8 @@ #define UFFD_API_RANGE_IOCTLS_BASIC \ ((__u64)1 << _UFFDIO_WAKE | \ (__u64)1 << _UFFDIO_COPY | \ - (__u64)1 << _UFFDIO_CONTINUE) + (__u64)1 << _UFFDIO_CONTINUE | \ + (__u64)1 << _UFFDIO_WRITEPROTECT) /* * Valid ioctl command number range with this API is from 0x00 to @@ -194,6 +196,9 @@ struct uffdio_api { * UFFD_FEATURE_EXACT_ADDRESS indicates that the exact address of page * faults would be provided and the offset within the page would not be * masked. + * + * UFFD_FEATURE_WP_HUGETLBFS_SHMEM indicates that userfaultfd + * write-protection mode is supported on both shmem and hugetlbfs. 
	 */
#define UFFD_FEATURE_PAGEFAULT_FLAG_WP		(1<<0)
#define UFFD_FEATURE_EVENT_FORK			(1<<1)
@@ -207,6 +212,7 @@ struct uffdio_api {
 #define UFFD_FEATURE_MINOR_HUGETLBFS		(1<<9)
 #define UFFD_FEATURE_MINOR_SHMEM		(1<<10)
 #define UFFD_FEATURE_EXACT_ADDRESS		(1<<11)
+#define UFFD_FEATURE_WP_HUGETLBFS_SHMEM	(1<<12)
 	__u64 features;
 
 	__u64 ioctls;
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 58d67f2bf980..156e9bdf9f23 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -730,15 +730,12 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
 	err = -ENOENT;
 	dst_vma = find_dst_vma(dst_mm, start, len);
-	/*
-	 * Make sure the vma is not shared, that the dst range is
-	 * both valid and fully within a single existing vma.
-	 */
-	if (!dst_vma || (dst_vma->vm_flags & VM_SHARED))
+
+	if (!dst_vma)
 		goto out_unlock;
 	if (!userfaultfd_wp(dst_vma))
 		goto out_unlock;
-	if (!vma_is_anonymous(dst_vma))
+	if (!vma_can_userfault(dst_vma, dst_vma->vm_flags))
 		goto out_unlock;
 
 	if (is_vm_hugetlb_page(dst_vma)) {

From patchwork Tue Apr 5 01:49:29 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800978
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz, Nadav Amit, Matthew Wilcox, Mike Rapoport,
 David Hildenbrand, Hugh Dickins, Jerome Glisse, "Kirill A. Shutemov",
 Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Alistair Popple,
 peterx@redhat.com
Subject: [PATCH v8 22/23] mm: Enable PTE markers by default
Date: Mon, 4 Apr 2022 21:49:29 -0400
Message-Id: <20220405014929.15158-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

Enable PTE markers by default. On x86_64 this also auto-enables
PTE_MARKER_UFFD_WP.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index 6e7c2d59fa96..3eca34c864c5 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -911,12 +911,14 @@ config ANON_VMA_NAME
 
 config PTE_MARKER
 	bool "Marker PTEs support"
+	default y
 	help
 	  Allows to create marker PTEs for file-backed memory.
 config PTE_MARKER_UFFD_WP
 	bool "Marker PTEs support for userfaultfd write protection"
+	default y
 	depends on PTE_MARKER && HAVE_ARCH_USERFAULTFD_WP
 	help

From patchwork Tue Apr 5 01:49:32 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12800979
From: Peter Xu <peterx@redhat.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Mike Kravetz, Nadav Amit, Matthew Wilcox, Mike Rapoport,
 David Hildenbrand, Hugh Dickins, Jerome Glisse, "Kirill A. Shutemov",
 Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Alistair Popple,
 peterx@redhat.com
Subject: [PATCH v8 23/23] selftests/uffd: Enable uffd-wp for shmem/hugetlbfs
Date: Mon, 4 Apr 2022 21:49:32 -0400
Message-Id: <20220405014932.15212-1-peterx@redhat.com>
In-Reply-To: <20220405014646.13522-1-peterx@redhat.com>
References: <20220405014646.13522-1-peterx@redhat.com>

Now that shmem and hugetlbfs support has been added, the uffd-wp test
can always be enabled.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tools/testing/selftests/vm/userfaultfd.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 92a4516f8f0d..bbc4a6d8cf7b 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -82,7 +82,7 @@ static int test_type;
 static volatile bool test_uffdio_copy_eexist = true;
 static volatile bool test_uffdio_zeropage_eexist = true;
 /* Whether to test uffd write-protection */
-static bool test_uffdio_wp = false;
+static bool test_uffdio_wp = true;
 /* Whether to test uffd minor faults */
 static bool test_uffdio_minor = false;
@@ -1594,8 +1594,6 @@ static void set_test_type(const char *type)
 	if (!strcmp(type, "anon")) {
 		test_type = TEST_ANON;
 		uffd_test_ops = &anon_uffd_test_ops;
-		/* Only enable write-protect test for anonymous test */
-		test_uffdio_wp = true;
 	} else if (!strcmp(type, "hugetlb")) {
 		test_type = TEST_HUGETLB;
 		uffd_test_ops = &hugetlb_uffd_test_ops;