From patchwork Sat Aug 7 03:25:18 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12424097
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Alistair Popple, Tiberiu Georgescu, ivan.teterevkov@nutanix.com,
 Mike Rapoport, Hugh Dickins, peterx@redhat.com, Matthew Wilcox,
 Andrea Arcangeli, David Hildenbrand, "Kirill A . Shutemov",
 Andrew Morton, Mike Kravetz
Subject: [PATCH RFC 1/4] mm: Introduce PTE_MARKER swap entry
Date: Fri, 6 Aug 2021 23:25:18 -0400
Message-Id: <20210807032521.7591-2-peterx@redhat.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20210807032521.7591-1-peterx@redhat.com>
References: <20210807032521.7591-1-peterx@redhat.com>

This patch introduces a new swap entry type called PTE_MARKER.  It can be
installed for any pte that maps file-backed memory when the pte is
temporarily zapped, so as to maintain per-pte information.
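
As a rough illustration of how a non-present pte can carry per-pte
information, the sketch below models a swap entry as a (type, offset) pair
packed into one 64-bit word, with one type number set aside for markers.  The
5 type bits and 58 offset bits, the type number 27 and every toy_* name are
assumptions made for this example; the real encoding is whatever swp_entry()
and the arch pte helpers produce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy layout (assumption): 5 type bits sitting above a 58-bit offset. */
#define TOY_TYPE_BITS		5
#define TOY_OFFSET_BITS		58
#define TOY_OFFSET_MASK		((UINT64_C(1) << TOY_OFFSET_BITS) - 1)

/* Hypothetical type number set aside for marker entries. */
#define TOY_SWP_PTE_MARKER	27u
/* First marker bit, mirroring PTE_MARKER_PAGEOUT from this patch. */
#define TOY_MARKER_PAGEOUT	(UINT64_C(1) << 0)
#define TOY_MARKER_MASK		(TOY_MARKER_PAGEOUT)

typedef struct { uint64_t val; } toy_swp_entry_t;

/* Pack a type and an offset into one word; the offset is truncated to fit. */
static inline toy_swp_entry_t toy_swp_entry(unsigned int type, uint64_t offset)
{
	toy_swp_entry_t e;

	assert(type < (1u << TOY_TYPE_BITS));
	e.val = ((uint64_t)type << TOY_OFFSET_BITS) | (offset & TOY_OFFSET_MASK);
	return e;
}

static inline unsigned int toy_swp_type(toy_swp_entry_t e)
{
	return (unsigned int)(e.val >> TOY_OFFSET_BITS);
}

static inline uint64_t toy_swp_offset(toy_swp_entry_t e)
{
	return e.val & TOY_OFFSET_MASK;
}

/* A marker entry is a swap entry whose offset field holds the marker bits. */
static inline toy_swp_entry_t toy_make_marker(uint64_t marker)
{
	return toy_swp_entry(TOY_SWP_PTE_MARKER, marker);
}

static inline bool toy_is_marker(toy_swp_entry_t e)
{
	return toy_swp_type(e) == TOY_SWP_PTE_MARKER;
}

static inline uint64_t toy_marker_get(toy_swp_entry_t e)
{
	return toy_swp_offset(e) & TOY_MARKER_MASK;
}
```

Calling toy_make_marker(TOY_MARKER_PAGEOUT) and passing the result to
toy_is_marker() and toy_marker_get() round-trips the marker bit, while an
ordinary entry such as toy_swp_entry(3, 1234) is not reported as a marker.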

The information kept in the pte is called a "marker".  The marker is defined
as "unsigned long" just to match pgoff_t; however, it only works if it still
fits in swp_offset(), which is e.g. currently 58 bits on x86_64.

The first marker bit introduced together with the new swap pte is the PAGEOUT
marker.  When that bit is set, it means this pte used to point to a page which
got swapped out.  This patch mostly adds definitions, so that the new swap
type is not entirely empty; the functions that handle it are not implemented
yet.

A new config CONFIG_PTE_MARKER is introduced too; it's off by default.

Signed-off-by: Peter Xu
---
 include/linux/swap.h    | 14 ++++++++++++-
 include/linux/swapops.h | 45 +++++++++++++++++++++++++++++++++++++++++
 mm/Kconfig              | 17 ++++++++++++++++
 3 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 6f5a43251593..545dc8e0b0fb 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -55,6 +55,18 @@ static inline int current_is_kswapd(void)
  * actions on faults.
  */
 
+/*
+ * PTE markers are used to persist information onto PTEs that are mapped with
+ * file-backed memories.
+ */
+#ifdef CONFIG_PTE_MARKER
+#define SWP_PTE_MARKER_NUM 1
+#define SWP_PTE_MARKER (MAX_SWAPFILES + SWP_HWPOISON_NUM + \
+			SWP_MIGRATION_NUM + SWP_DEVICE_NUM)
+#else
+#define SWP_PTE_MARKER_NUM 0
+#endif
+
 /*
  * Unaddressable device memory support. See include/linux/hmm.h and
  * Documentation/vm/hmm.rst. Short description is we need struct pages for
@@ -100,7 +112,7 @@ static inline int current_is_kswapd(void)
 
 #define MAX_SWAPFILES \
 	((1 << MAX_SWAPFILES_SHIFT) - SWP_DEVICE_NUM - \
-	SWP_MIGRATION_NUM - SWP_HWPOISON_NUM)
+	SWP_MIGRATION_NUM - SWP_HWPOISON_NUM - SWP_PTE_MARKER_NUM)
 
 /*
  * Magic header for a swap area. The first part of the union is
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index d356ab4047f7..3fec83449e1e 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -247,6 +247,51 @@ static inline int is_writable_migration_entry(swp_entry_t entry)
 
 #endif
 
+#ifdef CONFIG_PTE_MARKER
+
+#ifdef CONFIG_PTE_MARKER_PAGEOUT
+/* When this bit is set, it means this page was previously swapped out */
+#define PTE_MARKER_PAGEOUT (1UL << 0)
+#else
+#define PTE_MARKER_PAGEOUT 0
+#endif
+
+#define PTE_MARKER_MASK (PTE_MARKER_PAGEOUT)
+
+static inline swp_entry_t make_pte_marker_entry(unsigned long marker)
+{
+	return swp_entry(SWP_PTE_MARKER, marker);
+}
+
+static inline bool is_pte_marker_entry(swp_entry_t entry)
+{
+	return swp_type(entry) == SWP_PTE_MARKER;
+}
+
+static inline unsigned long pte_marker_get(swp_entry_t entry)
+{
+	return swp_offset(entry) & PTE_MARKER_MASK;
+}
+
+#else /* CONFIG_PTE_MARKER */
+
+static inline swp_entry_t make_pte_marker_entry(unsigned long marker)
+{
+	return swp_entry(0, 0);
+}
+
+static inline bool is_pte_marker_entry(swp_entry_t entry)
+{
+	return false;
+}
+
+static inline unsigned long pte_marker_get(swp_entry_t entry)
+{
+	return 0;
+}
+
+#endif /* CONFIG_PTE_MARKER */
+
 static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry)
 {
 	struct page *p = pfn_to_page(swp_offset(entry));
diff --git a/mm/Kconfig b/mm/Kconfig
index 40a9bfcd5062..6043d8f1c066 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -889,4 +889,21 @@ config IO_MAPPING
 config SECRETMEM
 	def_bool ARCH_HAS_SET_DIRECT_MAP && !EMBEDDED
 
+config PTE_MARKER
+	def_bool n
+	bool "Marker PTEs support"
+
+	help
+	  Allows to create marker PTEs for file-backed memory.
+
+config PTE_MARKER_PAGEOUT
+	def_bool n
+	depends on PTE_MARKER
+	bool "Shmem pagemap PM_SWAP support"
+
+	help
+	  Allows to create marker PTEs for file-backed memory when the page is
+	  swapped out.  It's required for pagemap to work correctly with shmem
+	  on page swapping.
+
 endmenu

From patchwork Sat Aug 7 03:25:19 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12424101
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Alistair Popple, Tiberiu Georgescu, ivan.teterevkov@nutanix.com,
 Mike Rapoport, Hugh Dickins, peterx@redhat.com, Matthew Wilcox,
 Andrea Arcangeli, David Hildenbrand, "Kirill A . Shutemov",
 Andrew Morton, Mike Kravetz
Subject: [PATCH RFC 2/4] mm: Check against orig_pte for finish_fault()
Date: Fri, 6 Aug 2021 23:25:19 -0400
Message-Id: <20210807032521.7591-3-peterx@redhat.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20210807032521.7591-1-peterx@redhat.com>
References: <20210807032521.7591-1-peterx@redhat.com>

finish_fault() used to re-check the pte against a none pte; in those cases
orig_pte was always a none pte anyway, so comparing against orig_pte is
equivalent.

This change prepares us to be able to call do_fault() on !none ptes.
For example, we should allow that to happen for a pte marker that has PAGEOUT
set.

Signed-off-by: Peter Xu
---
 mm/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 25fc46e87214..7288f585544a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4047,7 +4047,7 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 				      vmf->address, &vmf->ptl);
 	ret = 0;
 	/* Re-check under ptl */
-	if (likely(pte_none(*vmf->pte)))
+	if (likely(pte_same(*vmf->pte, vmf->orig_pte)))
 		do_set_pte(vmf, page, vmf->address);
 	else
 		ret = VM_FAULT_NOPAGE;

From patchwork Sat Aug 7 03:25:20 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12424103
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Alistair Popple, Tiberiu Georgescu, ivan.teterevkov@nutanix.com,
 Mike Rapoport, Hugh Dickins, peterx@redhat.com, Matthew Wilcox,
 Andrea Arcangeli, David Hildenbrand, "Kirill A . Shutemov",
 Andrew Morton, Mike Kravetz
Subject: [PATCH RFC 3/4] mm: Handle PTE_MARKER page faults
Date: Fri, 6 Aug 2021 23:25:20 -0400
Message-Id: <20210807032521.7591-4-peterx@redhat.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20210807032521.7591-1-peterx@redhat.com>
References: <20210807032521.7591-1-peterx@redhat.com>

handle_pte_marker() is the function that parses and handles all pte marker
faults.

For the PAGEOUT marker, it's as simple as dropping the pte and doing the fault
just like for a none pte.

The alternative would be to clear the pte to a none pte and retry the fault;
however, that would be slower than handling it right away.
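
The decision described above can be modelled in userspace roughly as follows.
This is only a sketch: the toy_* names, the two-value fault enum and the
callback that stands in for do_fault() are assumptions for illustration, and
the anonymous-VMA and empty-marker checks simply follow the diff in this
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's vm_fault_t codes. */
enum toy_fault { TOY_FAULT_DONE, TOY_FAULT_SIGBUS };

/* Mirrors PTE_MARKER_PAGEOUT from patch 1. */
#define TOY_MARKER_PAGEOUT	(1UL << 0)

/* Hypothetical callback standing in for do_fault(). */
typedef enum toy_fault (*toy_fault_fn)(void *ctx);

static inline enum toy_fault
toy_handle_pte_marker(bool vma_is_anon, unsigned long marker,
		      toy_fault_fn file_fault, void *ctx)
{
	assert(file_fault);

	/* Markers only make sense on file-backed memory and are never empty. */
	if (vma_is_anon || !marker)
		return TOY_FAULT_SIGBUS;

	/* PAGEOUT is only a hint: ignore it and fault like a none pte. */
	if (marker == TOY_MARKER_PAGEOUT)
		return file_fault(ctx);

	/* Some marker bit this model does not know how to handle. */
	return TOY_FAULT_SIGBUS;
}

/* Trivial file-fault stub that counts how many times it ran. */
static inline enum toy_fault toy_count_fault(void *ctx)
{
	++*(int *)ctx;
	return TOY_FAULT_DONE;
}
```

With an int counter passed as ctx, only the (file-backed, PAGEOUT) case
reaches toy_count_fault(); the anonymous, empty-marker and unknown-marker
cases all return TOY_FAULT_SIGBUS without touching the counter.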

Signed-off-by: Peter Xu
---
 mm/memory.c | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 7288f585544a..47f8ca064459 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -98,6 +98,8 @@ struct page *mem_map;
 EXPORT_SYMBOL(mem_map);
 #endif
 
+static vm_fault_t do_fault(struct vm_fault *vmf);
+
 /*
  * A number of key systems in x86 including ioremap() rely on the assumption
  * that high_memory defines the upper bound on direct map memory, then end
@@ -1394,6 +1396,10 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 
 			put_page(page);
 			continue;
+		} else if (is_pte_marker_entry(entry)) {
+			/* Drop PTE_MARKER_PAGEOUT when zapped */
+			pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
+			continue;
 		}
 
 		/* If details->check_mapping, we leave swap entries. */
@@ -3467,6 +3473,39 @@ static vm_fault_t remove_device_exclusive_entry(struct vm_fault *vmf)
 	return 0;
 }
 
+/*
+ * This function parses PTE markers and handles the faults.  It returns the
+ * vm_fault_t result of the fault: VM_FAULT_SIGBUS if the marker is invalid or
+ * cannot be handled, otherwise whatever do_fault() returns.
+ */
+static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
+{
+	swp_entry_t entry = pte_to_swp_entry(vmf->orig_pte);
+	unsigned long marker;
+
+	marker = pte_marker_get(entry);
+
+	/*
+	 * PTE markers should always be with file-backed memories, and the
+	 * marker should never be empty.  If anything weird happened, the best
+	 * thing to do is to kill the process along with its mm.
+	 */
+	if (WARN_ON_ONCE(vma_is_anonymous(vmf->vma) || !marker))
+		return VM_FAULT_SIGBUS;
+
+#ifdef CONFIG_PTE_MARKER_PAGEOUT
+	if (marker == PTE_MARKER_PAGEOUT)
+		/*
+		 * This pte is previously zapped for swap, the PAGEOUT is only
+		 * a flag before it's accessed again.  Safe to drop it now.
+		 */
+		return do_fault(vmf);
+#endif
+
+	/* We see some marker that we can't handle */
+	return VM_FAULT_SIGBUS;
+}
+
 /*
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
@@ -3503,6 +3542,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 			ret = vmf->page->pgmap->ops->migrate_to_ram(vmf);
 		} else if (is_hwpoison_entry(entry)) {
 			ret = VM_FAULT_HWPOISON;
+		} else if (is_pte_marker_entry(entry)) {
+			ret = handle_pte_marker(vmf);
 		} else {
 			print_bad_pte(vma, vmf->address, vmf->orig_pte, NULL);
 			ret = VM_FAULT_SIGBUS;

From patchwork Sat Aug 7 03:25:21 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12424105
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Alistair Popple, Tiberiu Georgescu, ivan.teterevkov@nutanix.com,
 Mike Rapoport, Hugh Dickins, peterx@redhat.com, Matthew Wilcox,
 Andrea Arcangeli, David Hildenbrand, "Kirill A . Shutemov",
 Andrew Morton, Mike Kravetz
Subject: [PATCH RFC 4/4] mm: Install marker pte when page out for shmem pages
Date: Fri, 6 Aug 2021 23:25:21 -0400
Message-Id: <20210807032521.7591-5-peterx@redhat.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20210807032521.7591-1-peterx@redhat.com>
References: <20210807032521.7591-1-peterx@redhat.com>

When shmem pages are swapped out, instead of clearing the pte entry, we leave a
marker pte showing that this page is swapped out, as a hint for pagemap.  A new
TTU flag is introduced to identify this case.

This can be useful for detecting swapped-out cold shmem pages.  Then, after
some background memory scanning work (which will fault in the shmem page and
confuse page reclaim), we can do MADV_PAGEOUT explicitly on this page to swap
it out again, as we know it was cold.

For pagemap, we don't need to explicitly set the PM_SWAP bit, because by nature
SWP_PTE_MARKER ptes are already counted as PM_SWAP due to their format as swap
entries.
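
For readers who want to see the userspace side, here is a minimal sketch of
how a tool could test the swap bit for one virtual address of its own.  The
bit positions (63 present, 62 swapped, 0-4 swap type) are the ones given in
the kernel's pagemap documentation; the pm_* and pagemap_read() helper names
are made up for this example, and only /proc/self/pagemap is handled.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for pread() under strict -std= modes */
#endif

#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit positions of a 64-bit pagemap entry. */
#define PM_BIT_PRESENT	63
#define PM_BIT_SWAP	62

static inline bool pm_is_present(uint64_t ent)
{
	return (ent >> PM_BIT_PRESENT) & 1;
}

static inline bool pm_is_swapped(uint64_t ent)
{
	return (ent >> PM_BIT_SWAP) & 1;
}

/*
 * Swap type lives in bits 0-4 of a swapped entry.  The kernel only fills
 * in this field for privileged readers; otherwise it reads back as zero.
 */
static inline unsigned int pm_swap_type(uint64_t ent)
{
	assert(pm_is_swapped(ent));
	return (unsigned int)(ent & 0x1f);
}

/* Read the pagemap entry covering addr in the calling process. */
static inline int pagemap_read(const void *addr, uint64_t *ent)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t off;
	ssize_t n;
	int fd;

	if (page_size <= 0)
		return -1;
	/* One 8-byte entry per virtual page, indexed by page number. */
	off = (off_t)((uintptr_t)addr / (uintptr_t)page_size) *
	      (off_t)sizeof(*ent);

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;
	n = pread(fd, ent, sizeof(*ent), off);
	close(fd);
	return n == (ssize_t)sizeof(*ent) ? 0 : -1;
}
```

A scanner would call pagemap_read() on an address inside a shmem mapping and
then check pm_is_swapped() on the result; with this series applied, a pte
holding the PAGEOUT marker is expected to be reported the same way as any
other swap pte.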

Signed-off-by: Peter Xu
---
 fs/proc/task_mmu.c   |  1 +
 include/linux/rmap.h |  1 +
 mm/rmap.c            | 19 +++++++++++++++++++
 mm/vmscan.c          |  2 +-
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index eb97468dfe4c..21b8594abc1d 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1384,6 +1384,7 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
 		if (pm->show_pfn)
 			frame = swp_type(entry) |
 				(swp_offset(entry) << MAX_SWAPFILES_SHIFT);
+		/* NOTE: this covers PTE_MARKER_PAGEOUT too */
 		flags |= PM_SWAP;
 		if (is_pfn_swap_entry(entry))
 			page = pfn_swap_entry_to_page(entry);
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index c976cc6de257..318a0e95c7fb 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -95,6 +95,7 @@ enum ttu_flags {
 					 * do a final flush if necessary */
 	TTU_RMAP_LOCKED		= 0x80,	/* do not grab rmap lock:
 					 * caller holds it */
+	TTU_HINT_PAGEOUT	= 0x100, /* Hint for pageout operation */
 };
 
 #ifdef CONFIG_MMU
diff --git a/mm/rmap.c b/mm/rmap.c
index b9eb5c12f3fe..24a70b36b6da 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1384,6 +1384,22 @@ void page_remove_rmap(struct page *page, bool compound)
 	unlock_page_memcg(page);
 }
 
+static inline void
+pte_marker_install(struct vm_area_struct *vma, pte_t *pte,
+		   struct page *page, unsigned long address)
+{
+#ifdef CONFIG_PTE_MARKER_PAGEOUT
+	swp_entry_t entry;
+	pte_t pteval;
+
+	if (vma_is_shmem(vma) && !PageAnon(page) && pte_none(*pte)) {
+		entry = make_pte_marker_entry(PTE_MARKER_PAGEOUT);
+		pteval = swp_entry_to_pte(entry);
+		set_pte_at(vma->vm_mm, address, pte, pteval);
+	}
+#endif
+}
+
 /*
  * @arg: enum ttu_flags will be passed to this argument
  */
@@ -1628,6 +1644,9 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			 */
 			dec_mm_counter(mm, mm_counter_file(page));
 		}
+
+		if (flags & TTU_HINT_PAGEOUT)
+			pte_marker_install(vma, pvmw.pte, page, address);
 discard:
 		/*
 		 * No need to call mmu_notifier_invalidate_range() it has be
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4620df62f0ff..4754af6fa24b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1493,7 +1493,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 		 * processes. Try to unmap it here.
 		 */
 		if (page_mapped(page)) {
-			enum ttu_flags flags = TTU_BATCH_FLUSH;
+			enum ttu_flags flags = TTU_BATCH_FLUSH | TTU_HINT_PAGEOUT;
 			bool was_swapbacked = PageSwapBacked(page);
 
 			if (unlikely(PageTransHuge(page)))