From patchwork Sun Jan 16 12:18:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 12714539 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C617C433FE for ; Sun, 16 Jan 2022 12:18:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235175AbiAPMSl (ORCPT ); Sun, 16 Jan 2022 07:18:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58186 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235109AbiAPMSd (ORCPT ); Sun, 16 Jan 2022 07:18:33 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 69CE3C06173F; Sun, 16 Jan 2022 04:18:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=vTljgvHhqZG3cT2RqoyXMcY1zZutR0LRzeE9Ah9ayuk=; b=qpCpEjsgcgij2zVHnH1WhH1Opg 5COj5QM3fLjZ62EFFu7Fz/3O7RcjVnqvhubuAolQcCVWIDlTTZM607nQaE76CajV2ILjyMdhUuFbF B07DdDlNrjU2AsmoMiUmk9oyM/iGyjXS6Xnc/K4fZkNydVYJzhEzZQB0oJJSVO5jIQjM1dpQPDhwo YlucFFuseK5sDqCqkpvZwpx3iCoMB3d57MvCEnX/2Q8WAg44whIg/wjHW3RWZ4oucF9LXwInqgpnv tyJ2fCzGXjiwFbEP3NbQo9A7FOS8eZNS1TlfQXX/bDWUJkIU5rfOJDjlxylWXtUfDeSjtb5OI3GE/ wrK5XMnw==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1n94UM-007FUB-6X; Sun, 16 Jan 2022 12:18:26 +0000 From: "Matthew Wilcox (Oracle)" To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" , Christoph Hellwig , John Hubbard , Jason Gunthorpe , William Kucharski Subject: [PATCH 01/12] mm: Add folio_put_refs() Date: Sun, 16 Jan 2022 12:18:11 +0000 Message-Id: <20220116121822.1727633-2-willy@infradead.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220116121822.1727633-1-willy@infradead.org> References: <20220116121822.1727633-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This is like folio_put(), but puts N references at once instead of just one. It's like put_page_refs(), but does one atomic operation instead of two, and is available to more than just gup.c. Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Christoph Hellwig Reviewed-by: John Hubbard Reviewed-by: Jason Gunthorpe Reviewed-by: William Kucharski --- include/linux/mm.h | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index c768a7c81b0b..cb98f75b245e 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1244,6 +1244,26 @@ static inline void folio_put(struct folio *folio) __put_page(&folio->page); } +/** + * folio_put_refs - Reduce the reference count on a folio. + * @folio: The folio. + * @refs: The amount to subtract from the folio's reference count. + * + * If the folio's reference count reaches zero, the memory will be + * released back to the page allocator and may be used by another + * allocation immediately. 
Do not access the memory or the struct folio + * after calling folio_put_refs() unless you can be sure that these weren't + * the last references. + * + * Context: May be called in process or interrupt context, but not in NMI + * context. May be called while holding a spinlock. + */ +static inline void folio_put_refs(struct folio *folio, int refs) +{ + if (folio_ref_sub_and_test(folio, refs)) + __put_page(&folio->page); +} + static inline void put_page(struct page *page) { struct folio *folio = page_folio(page); From patchwork Sun Jan 16 12:18:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 12714530 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5EF5CC433EF for ; Sun, 16 Jan 2022 12:18:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235066AbiAPMS2 (ORCPT ); Sun, 16 Jan 2022 07:18:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58150 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235038AbiAPMS2 (ORCPT ); Sun, 16 Jan 2022 07:18:28 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1D959C061574; Sun, 16 Jan 2022 04:18:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=3MIQy1Z2cREUy+AYirD8/hbvJvQAYl4W5xf1lrR8/fc=; b=Z4YcmDRUqYP1PmRMV4Cq/T/YCL 44F6kIQbZPnImDlMxhiRXRfXke9wm9hnu+ycUToDuWrmdiSzq6DrycKO9fRyKn23vfwRPie7hKmFT t8G5pmM1qEGBc6n4yYjxnyEKwjKDLSHpegALqtoNUVdK5pkCQ44E+ynfeBm2El5JuqmJBLHATpwfv HOxCuIHnwdqiqlmHlWZpxi0c8XjlhXMvvaztXsb9FuPOTGBU2HA1ABEV+mQv2vARDRj37RSDE/dLL rIwMojoZJ22p76EGYNRn2w9ts+hFFa0LgSsr3i/aXjS4a7g2g7aMP0TLL3ju09B7URdCnvX/aGM2G jx0e5u7w==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1n94UM-007FUD-8P; Sun, 16 Jan 2022 12:18:26 +0000 From: "Matthew Wilcox (Oracle)" To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 02/12] filemap: Use folio_put_refs() in filemap_free_folio() Date: Sun, 16 Jan 2022 12:18:12 +0000 Message-Id: <20220116121822.1727633-3-willy@infradead.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220116121822.1727633-1-willy@infradead.org> References: <20220116121822.1727633-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This shrinks filemap_free_folio() by 55 bytes in my .config; 24 bytes from removing the VM_BUG_ON_FOLIO() and 31 bytes from unifying the small/large folio paths. We could just use folio_ref_sub() here since the caller should hold a reference (as the VM_BUG_ON_FOLIO() was asserting), but that's fragile. 
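For reference, a consolidated view of the function after this change, reconstructed from the diff below:

	void filemap_free_folio(struct address_space *mapping, struct folio *folio)
	{
		void (*freepage)(struct page *);
		int refs = 1;

		freepage = mapping->a_ops->freepage;
		if (freepage)
			freepage(&folio->page);

		/* the page cache holds one reference per page of a large folio */
		if (folio_test_large(folio) && !folio_test_hugetlb(folio))
			refs = folio_nr_pages(folio);
		folio_put_refs(folio, refs);
	}
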
Signed-off-by: Matthew Wilcox (Oracle) --- mm/filemap.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index 2fd9b2f24025..afc8f5ca85ac 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -231,17 +231,15 @@ void __filemap_remove_folio(struct folio *folio, void *shadow) void filemap_free_folio(struct address_space *mapping, struct folio *folio) { void (*freepage)(struct page *); + int refs = 1; freepage = mapping->a_ops->freepage; if (freepage) freepage(&folio->page); - if (folio_test_large(folio) && !folio_test_hugetlb(folio)) { - folio_ref_sub(folio, folio_nr_pages(folio)); - VM_BUG_ON_FOLIO(folio_ref_count(folio) <= 0, folio); - } else { - folio_put(folio); - } + if (folio_test_large(folio) && !folio_test_hugetlb(folio)) + refs = folio_nr_pages(folio); + folio_put_refs(folio, refs); } /** From patchwork Sun Jan 16 12:18:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 12714542 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 65AE8C433FE for ; Sun, 16 Jan 2022 12:18:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235226AbiAPMSr (ORCPT ); Sun, 16 Jan 2022 07:18:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58186 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235038AbiAPMSi (ORCPT ); Sun, 16 Jan 2022 07:18:38 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2474FC061748; Sun, 16 Jan 2022 04:18:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=xWU6598e4aJ6j8LBEVFg/+CSnDbanOj2pq/9yym6seM=; b=rbCpPuUDrtplvD2Re5I4f0vZpA U64QaNgvTkA4w7eHgfz8wtBAHipbF1+rombZMLPZ0A0OpcrMIhbjhJi+/wqXOjtE4HCQXb0Tp1PAt s0cihpaRBiw3iAZ5b/GDNR2di8Wo8QfKwG9AkbJKJJyiSJiMXUp0OIs5VuZhEhaz6KyIDcknXkq2X Md/yv9peIyIAKv4n4m9E5wGIjb6KtQiQuqYWD7FoW92q6OLgQGm9bF2XQTi22bSyjQFIatQbOwK4/ yvtahBz8MDpJjdQMLu/6U409X+RBe4eEwsc+5gy/UVgoJE1WxLOIhiV+TnmZeEKOhnj1+4oxpQUNM 0/FFx7JQ==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1n94UM-007FUF-Af; Sun, 16 Jan 2022 12:18:26 +0000 From: "Matthew Wilcox (Oracle)" To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 03/12] filemap: Allow large folios to be added to the page cache Date: Sun, 16 Jan 2022 12:18:13 +0000 Message-Id: <20220116121822.1727633-4-willy@infradead.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220116121822.1727633-1-willy@infradead.org> References: <20220116121822.1727633-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org We return -EEXIST if there are any non-shadow entries in the page cache in the range covered by the folio. If there are multiple shadow entries in the range, we set *shadowp to one of them (currently the one at the highest index). 
If that turns out to be the wrong answer, we can implement something more complex. This is mostly modelled after the equivalent function in the shmem code. Signed-off-by: Matthew Wilcox (Oracle) --- mm/filemap.c | 39 ++++++++++++++++++++++----------------- 1 file changed, 22 insertions(+), 17 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index afc8f5ca85ac..fe079b676ab7 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -851,26 +851,27 @@ noinline int __filemap_add_folio(struct address_space *mapping, { XA_STATE(xas, &mapping->i_pages, index); int huge = folio_test_hugetlb(folio); - int error; bool charged = false; + long nr = 1; VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio); VM_BUG_ON_FOLIO(folio_test_swapbacked(folio), folio); mapping_set_update(&xas, mapping); - folio_get(folio); - folio->mapping = mapping; - folio->index = index; - if (!huge) { - error = mem_cgroup_charge(folio, NULL, gfp); + int error = mem_cgroup_charge(folio, NULL, gfp); VM_BUG_ON_FOLIO(index & (folio_nr_pages(folio) - 1), folio); if (error) - goto error; + return error; charged = true; + xas_set_order(&xas, index, folio_order(folio)); + nr = folio_nr_pages(folio); } gfp &= GFP_RECLAIM_MASK; + folio_ref_add(folio, nr); + folio->mapping = mapping; + folio->index = xas.xa_index; do { unsigned int order = xa_get_order(xas.xa, xas.xa_index); @@ -894,6 +895,8 @@ noinline int __filemap_add_folio(struct address_space *mapping, /* entry may have been split before we acquired lock */ order = xa_get_order(xas.xa, xas.xa_index); if (order > folio_order(folio)) { + /* How to handle large swap entries? */ + BUG_ON(shmem_mapping(mapping)); xas_split(&xas, old, order); xas_reset(&xas); } @@ -903,29 +906,31 @@ noinline int __filemap_add_folio(struct address_space *mapping, if (xas_error(&xas)) goto unlock; - mapping->nrpages++; + mapping->nrpages += nr; /* hugetlb pages do not participate in page cache accounting */ - if (!huge) - __lruvec_stat_add_folio(folio, NR_FILE_PAGES); + if (!huge) { + __lruvec_stat_mod_folio(folio, NR_FILE_PAGES, nr); + if (folio_test_pmd_mappable(folio)) + __lruvec_stat_mod_folio(folio, + NR_FILE_THPS, nr); + } unlock: xas_unlock_irq(&xas); } while (xas_nomem(&xas, gfp)); - if (xas_error(&xas)) { - error = xas_error(&xas); - if (charged) - mem_cgroup_uncharge(folio); + if (xas_error(&xas)) goto error; - } trace_mm_filemap_add_to_page_cache(folio); return 0; error: + if (charged) + mem_cgroup_uncharge(folio); folio->mapping = NULL; /* Leave page->index set: truncation relies upon it */ - folio_put(folio); - return error; + folio_put_refs(folio, nr); + return xas_error(&xas); } ALLOW_ERROR_INJECTION(__filemap_add_folio, ERRNO); From patchwork Sun Jan 16 12:18:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 12714540 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 82FA7C433EF for ; Sun, 16 Jan 2022 12:18:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235188AbiAPMSm (ORCPT ); Sun, 16 Jan 2022 07:18:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235112AbiAPMSe (ORCPT ); Sun, 16 Jan 2022 07:18:34 -0500 Received: from casper.infradead.org (casper.infradead.org 
[IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 15F63C061574; Sun, 16 Jan 2022 04:18:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=DgDHeNwPS9XAK0JO3fVKJQ4TsoEJSTs7hmi2i7kzloU=; b=UsA9wMhthgEmyRAyyN0eOL4wHq BBnwv4WAChPolc/5fswDbTKaaDo66RdkzsTTI77OvtwhbWCU2bmvauPXmje3o0PKno99CGclqqmvw KrgLAxFpOgGk7Q7DncaNjS1uO5K5txNCvb+3rat+Ow9xSrtr9eKD/pSjxnVK9tYZe+siGUYINgB/I R6I1L1xj+LXUt+zzQsmJ3UaQp378toif1wubVCuxlr96YYkkbpHXNKOSGF6WiTT0bEB1OaAfIQaaY PGN3ikv1L1z9YjI6e6SGtsJcuWNy7IsQvfa43s8q9Z/3VQ+0y6YOmEDGip1WKpxhfIynTzLXqiVJ/ 5IIEHkHQ==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1n94UM-007FUH-D6; Sun, 16 Jan 2022 12:18:26 +0000 From: "Matthew Wilcox (Oracle)" To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 04/12] mm/vmscan: Free non-shmem folios without splitting them Date: Sun, 16 Jan 2022 12:18:14 +0000 Message-Id: <20220116121822.1727633-5-willy@infradead.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220116121822.1727633-1-willy@infradead.org> References: <20220116121822.1727633-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org We have to allocate memory in order to split a file-backed folio, so it's not a good idea to split them in the memory freeing path. It also doesn't work for XFS because pages have an extra reference count from page_has_private() and split_huge_page() expects that reference to have already been removed. Unfortunately, we still have to split shmem THPs because we can't handle swapping out an entire THP yet. 
Signed-off-by: Matthew Wilcox (Oracle) --- mm/vmscan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 700434db5735..45665874082d 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1728,8 +1728,8 @@ static unsigned int shrink_page_list(struct list_head *page_list, /* Adding to swap updated mapping */ mapping = page_mapping(page); } - } else if (unlikely(PageTransHuge(page))) { - /* Split file THP */ + } else if (PageSwapBacked(page) && PageTransHuge(page)) { + /* Split shmem THP */ if (split_huge_page_to_list(page, page_list)) goto keep_locked; } From patchwork Sun Jan 16 12:18:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 12714541 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB47EC433EF for ; Sun, 16 Jan 2022 12:18:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235163AbiAPMSp (ORCPT ); Sun, 16 Jan 2022 07:18:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235171AbiAPMSk (ORCPT ); Sun, 16 Jan 2022 07:18:40 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 171D7C061574; Sun, 16 Jan 2022 04:18:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=mH8MdEUBlsPhf7ilWJH9nUxsrJCyVKcinjFNzwz2y+c=; b=HADWwBMt7yDSfmBsMV6lI8Kdp9 gBuwraxP0C+hBTdLIjfFIK3OLLD/DrTdQ8u0qr2PVGguULrYpoZWChFHZKCPTRqeSOeEdo2ZHl78i ln6sfaOTz/vKTy4odQlD+D1Cv7jnk716fdVa/oVFspyPHrEZUNO3ZMBWUL1F/RF9wwVTg9BKnFAK6 s/InMwuxOMbmtWo7c74ctnEu829l5uNC4HvoAabVtrd7xNbHTUoluLkZ/QfP5aG9fzOMABVwmMvwq I9JTaoUNHsQlLKBZzf15ab5vTu533gYiccgQMgqx35DA/D7xeRk8YGksLRKrjlaZqz80HyXf8kURD jOZLZwtg==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1n94UM-007FUJ-FF; Sun, 16 Jan 2022 12:18:26 +0000 From: "Matthew Wilcox (Oracle)" To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 05/12] mm: Fix READ_ONLY_THP warning Date: Sun, 16 Jan 2022 12:18:15 +0000 Message-Id: <20220116121822.1727633-6-willy@infradead.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220116121822.1727633-1-willy@infradead.org> References: <20220116121822.1727633-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org These counters only exist if CONFIG_READ_ONLY_THP_FOR_FS is defined, but we do not need to warn if the filesystem natively supports large folios. 
Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/pagemap.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index 270bf5136c34..877dabed0316 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -212,7 +212,7 @@ static inline void filemap_nr_thps_inc(struct address_space *mapping) if (!mapping_large_folio_support(mapping)) atomic_inc(&mapping->nr_thps); #else - WARN_ON_ONCE(1); + WARN_ON_ONCE(mapping_large_folio_support(mapping) == 0); #endif } @@ -222,7 +222,7 @@ static inline void filemap_nr_thps_dec(struct address_space *mapping) if (!mapping_large_folio_support(mapping)) atomic_dec(&mapping->nr_thps); #else - WARN_ON_ONCE(1); + WARN_ON_ONCE(mapping_large_folio_support(mapping) == 0); #endif } From patchwork Sun Jan 16 12:18:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 12714531 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 02957C433FE for ; Sun, 16 Jan 2022 12:18:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235080AbiAPMS3 (ORCPT ); Sun, 16 Jan 2022 07:18:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58152 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235062AbiAPMS2 (ORCPT ); Sun, 16 Jan 2022 07:18:28 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3A860C06161C; Sun, 16 Jan 2022 04:18:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=nR4q6LlG4aGXJETtHyIwh712yfW3kU3d8of6ibolorw=; b=skxRut/WqIDWQi1x1bJxL7nfWV 4K6mG+tEfRQsHFAfjGW3fuPyDROkCHtjOfjP4uuuWO4XM5+QfRhizrFk7InpT7Sq27J+/yyk/52JX rR4Mqoh+iZpSzGel6T8UsGvic5F3Sp++fCGdjNDNFzpRmFoMx1f2lj53jsemOtjwcd7uzu5Po7hzH XyvIQRYQ2Zwjv6nULypWTl1m76xP3NtfqvalshW/TgMo3/RNflTq9jpSSe4hbC6H/txt9eS7p1MIJ vBr+m2tJcEwrwAfp46jfM+1jhLjyTag5Qj3xFH5Et8Mt53MXAj1ZwMi11KEtbNtZOInD+UAfrWOF6 6kkhnMkg==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1n94UM-007FUL-Hj; Sun, 16 Jan 2022 12:18:26 +0000 From: "Matthew Wilcox (Oracle)" To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 06/12] mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios Date: Sun, 16 Jan 2022 12:18:16 +0000 Message-Id: <20220116121822.1727633-7-willy@infradead.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220116121822.1727633-1-willy@infradead.org> References: <20220116121822.1727633-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org A large folio which is smaller than a PMD does not need to do the extra work in try_to_unmap() of trying to split a PMD entry. 
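For example, with 4kB pages HPAGE_PMD_ORDER is 9, so after this change an order-4 (64kB) folio no longer sets TTU_SPLIT_HUGE_PMD before try_to_unmap(); only PMD-sized folios still request the PMD split.
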
Signed-off-by: Matthew Wilcox (Oracle) --- mm/vmscan.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 45665874082d..3181bf2f8a37 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1754,7 +1754,8 @@ static unsigned int shrink_page_list(struct list_head *page_list, enum ttu_flags flags = TTU_BATCH_FLUSH; bool was_swapbacked = PageSwapBacked(page); - if (unlikely(PageTransHuge(page))) + if (PageTransHuge(page) && + thp_order(page) >= HPAGE_PMD_ORDER) flags |= TTU_SPLIT_HUGE_PMD; try_to_unmap(page, flags); From patchwork Sun Jan 16 12:18:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 12714533 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BDC01C43219 for ; Sun, 16 Jan 2022 12:18:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235073AbiAPMS3 (ORCPT ); Sun, 16 Jan 2022 07:18:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58156 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235053AbiAPMS2 (ORCPT ); Sun, 16 Jan 2022 07:18:28 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 51DAFC06173E; Sun, 16 Jan 2022 04:18:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=x22IDWNKLOZQBYNT5ZnUP/ws3USwr++SKpDySW8y9SA=; b=HyvtLeCzLmcJvJgRUexW0kzX6z +CRkAaXfZ/sTESZXnMn5cpVO7vkJEmjCnKIAk9RLbndO0OCnmumRMxPCQOkB1h+hACL+eHJGozBpV YR5aEcdeM3fb+OMfI6dAAXefLBP+ZijUN+2RDaQodeu2/GSvNBpUFyE/8wVkcRrCN3spFVOwfaE5H FhgDkdNCnn4Gn5WzHWwSWrKrPM9IeQIENcEda5qztngwQTt9WBudC3M9mKx+/ZxC50QDup2/c+SXQ IODSc9BcXn9j2IDVgDVnrKAQqCPWOnMAufdlvb149RlGDdTxN2yB9IlfjR+VSYfduJaiteCUzzcM/ ZMKSG3cA==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1n94UM-007FUS-LE; Sun, 16 Jan 2022 12:18:26 +0000 From: "Matthew Wilcox (Oracle)" To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 07/12] mm: Make large folios depend on THP Date: Sun, 16 Jan 2022 12:18:17 +0000 Message-Id: <20220116121822.1727633-8-willy@infradead.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220116121822.1727633-1-willy@infradead.org> References: <20220116121822.1727633-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Some parts of the VM still depend on THP to handle large folios correctly. Until those are fixed, prevent creating large folios if THP are disabled. 
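As a minimal sketch of how this interacts with a filesystem that has opted in (the inode-setup context here is illustrative, not taken from any particular filesystem; mapping_set_large_folios() is the existing opt-in helper visible in the hunk below):

	/* e.g. somewhere in a filesystem's inode initialisation */
	mapping_set_large_folios(inode->i_mapping);

	/*
	 * On a CONFIG_TRANSPARENT_HUGEPAGE=n kernel the AS_LARGE_FOLIO_SUPPORT
	 * bit is still set, but mapping_large_folio_support() now returns
	 * false, so callers such as the readahead code later in this series
	 * keep allocating order-0 folios.
	 */
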
Signed-off-by: Matthew Wilcox (Oracle) --- include/linux/pagemap.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index 877dabed0316..3e348e0a9e4e 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -192,9 +192,14 @@ static inline void mapping_set_large_folios(struct address_space *mapping) __set_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags); } +/* + * Large folio support currently depends on THP. These dependencies are + * being worked on but are not yet fixed. + */ static inline bool mapping_large_folio_support(struct address_space *mapping) { - return test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags); + return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) && + test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags); } static inline int filemap_nr_thps(struct address_space *mapping) From patchwork Sun Jan 16 12:18:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 12714532 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0A1BC43217 for ; Sun, 16 Jan 2022 12:18:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235086AbiAPMSa (ORCPT ); Sun, 16 Jan 2022 07:18:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58158 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229610AbiAPMS2 (ORCPT ); Sun, 16 Jan 2022 07:18:28 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6C3BDC06173F; Sun, 16 Jan 2022 04:18:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=TcquNu2GjfQJyfxq8hgdegm77gZgKaMIuUwOTeONZRk=; b=HszW9kNohERIgEv2cm7ENMk1oV Xbjiigj7kShxOLo1vqoG1RPkgfTFlmz3w8SNet0zIq0tvauEUquW9U2SbLeP0BjHoXI4Bn1ADUxaq cewsQFZw+dNdHqm6G1JCOqArDPby2i3AAf08F8Md74VXQPBpKpOJrTvaJ1V7dgBWftqLZcVr2jEp5 nl3RIqDB07AmWoZZEa7Lux6WnPcEiRJwvv4Fg2XSa41cjJBW/F5JGKhvQy42/blVmA2ayy9qdSiHv W+xF/qYQ7Gbks4vPKZIkhz9h2ImDiv8PYJttODBvSin6fLNYsrfKTAwbd+hH7Ci64yW6nOfsW7hUL tMPT8bLg==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1n94UM-007FUb-OY; Sun, 16 Jan 2022 12:18:26 +0000 From: "Matthew Wilcox (Oracle)" To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 08/12] mm/readahead: Add large folio readahead Date: Sun, 16 Jan 2022 12:18:18 +0000 Message-Id: <20220116121822.1727633-9-willy@infradead.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220116121822.1727633-1-willy@infradead.org> References: <20220116121822.1727633-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Allocate large folios in the readahead code when the filesystem supports them and it seems worth doing. The heuristic for choosing which folio sizes will surely need some tuning, but this aggressive ramp-up has been good for testing. 
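As a rough worked example of the ramp-up implemented below (assuming 4kB pages, CONFIG_TRANSPARENT_HUGEPAGE=y so MAX_PAGECACHE_ORDER is HPAGE_PMD_ORDER == 9, and an ra->size large enough not to clamp the order): a round that starts from order 0 allocates order-2 (16kB) folios; when one of those folios later triggers async readahead, the next round uses order 4 (64kB), then 6 (256kB), then 8 (1MB), and finally caps at order 9 (2MB). Within each round the order is additionally reduced so that folios stay naturally aligned (__ffs(index)) and do not extend past EOF.
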
Signed-off-by: Matthew Wilcox (Oracle) --- mm/readahead.c | 106 +++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 99 insertions(+), 7 deletions(-) diff --git a/mm/readahead.c b/mm/readahead.c index cf0dcf89eb69..5100eaf5b0ee 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -148,7 +148,7 @@ static void read_pages(struct readahead_control *rac, struct list_head *pages, blk_finish_plug(&plug); - BUG_ON(!list_empty(pages)); + BUG_ON(pages && !list_empty(pages)); BUG_ON(readahead_count(rac)); out: @@ -431,11 +431,103 @@ static int try_context_readahead(struct address_space *mapping, return 1; } +/* + * There are some parts of the kernel which assume that PMD entries + * are exactly HPAGE_PMD_ORDER. Those should be fixed, but until then, + * limit the maximum allocation order to PMD size. I'm not aware of any + * assumptions about maximum order if THP are disabled, but 8 seems like + * a good order (that's 1MB if you're using 4kB pages) + */ +#ifdef CONFIG_TRANSPARENT_HUGEPAGE +#define MAX_PAGECACHE_ORDER HPAGE_PMD_ORDER +#else +#define MAX_PAGECACHE_ORDER 8 +#endif + +static inline int ra_alloc_folio(struct readahead_control *ractl, pgoff_t index, + pgoff_t mark, unsigned int order, gfp_t gfp) +{ + int err; + struct folio *folio = filemap_alloc_folio(gfp, order); + + if (!folio) + return -ENOMEM; + if (mark - index < (1UL << order)) + folio_set_readahead(folio); + err = filemap_add_folio(ractl->mapping, folio, index, gfp); + if (err) + folio_put(folio); + else + ractl->_nr_pages += 1UL << order; + return err; +} + +static void page_cache_ra_order(struct readahead_control *ractl, + struct file_ra_state *ra, unsigned int new_order) +{ + struct address_space *mapping = ractl->mapping; + pgoff_t index = readahead_index(ractl); + pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT; + pgoff_t mark = index + ra->size - ra->async_size; + int err = 0; + gfp_t gfp = readahead_gfp_mask(mapping); + + if (!mapping_large_folio_support(mapping) || ra->size < 4) + goto fallback; + + limit = min(limit, index + ra->size - 1); + + if (new_order < MAX_PAGECACHE_ORDER) { + new_order += 2; + if (new_order > MAX_PAGECACHE_ORDER) + new_order = MAX_PAGECACHE_ORDER; + while ((1 << new_order) > ra->size) + new_order--; + } + + while (index <= limit) { + unsigned int order = new_order; + + /* Align with smaller pages if needed */ + if (index & ((1UL << order) - 1)) { + order = __ffs(index); + if (order == 1) + order = 0; + } + /* Don't allocate pages past EOF */ + while (index + (1UL << order) - 1 > limit) { + if (--order == 1) + order = 0; + } + err = ra_alloc_folio(ractl, index, mark, order, gfp); + if (err) + break; + index += 1UL << order; + } + + if (index > limit) { + ra->size += index - limit - 1; + ra->async_size += index - limit - 1; + } + + read_pages(ractl, NULL, false); + + /* + * If there were already pages in the page cache, then we may have + * left some gaps. Let the regular readahead code take care of this + * situation. + */ + if (!err) + return; +fallback: + do_page_cache_ra(ractl, ra->size, ra->async_size); +} + /* * A minimal readahead algorithm for trivial sequential/random reads. 
*/ static void ondemand_readahead(struct readahead_control *ractl, - bool hit_readahead_marker, unsigned long req_size) + struct folio *folio, unsigned long req_size) { struct backing_dev_info *bdi = inode_to_bdi(ractl->mapping->host); struct file_ra_state *ra = ractl->ra; @@ -470,12 +562,12 @@ static void ondemand_readahead(struct readahead_control *ractl, } /* - * Hit a marked page without valid readahead state. + * Hit a marked folio without valid readahead state. * E.g. interleaved reads. * Query the pagecache for async_size, which normally equals to * readahead size. Ramp it up and use it as the new readahead size. */ - if (hit_readahead_marker) { + if (folio) { pgoff_t start; rcu_read_lock(); @@ -548,7 +640,7 @@ static void ondemand_readahead(struct readahead_control *ractl, } ractl->_index = ra->start; - do_page_cache_ra(ractl, ra->size, ra->async_size); + page_cache_ra_order(ractl, ra, folio ? folio_order(folio) : 0); } void page_cache_sync_ra(struct readahead_control *ractl, @@ -576,7 +668,7 @@ void page_cache_sync_ra(struct readahead_control *ractl, } /* do read-ahead */ - ondemand_readahead(ractl, false, req_count); + ondemand_readahead(ractl, NULL, req_count); } EXPORT_SYMBOL_GPL(page_cache_sync_ra); @@ -605,7 +697,7 @@ void page_cache_async_ra(struct readahead_control *ractl, return; /* do read-ahead */ - ondemand_readahead(ractl, true, req_count); + ondemand_readahead(ractl, folio, req_count); } EXPORT_SYMBOL_GPL(page_cache_async_ra); From patchwork Sun Jan 16 12:18:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 12714536 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0F0DFC4332F for ; Sun, 16 Jan 2022 12:18:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235125AbiAPMSf (ORCPT ); Sun, 16 Jan 2022 07:18:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58172 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235089AbiAPMSb (ORCPT ); Sun, 16 Jan 2022 07:18:31 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E591AC061574; Sun, 16 Jan 2022 04:18:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=hGAUTyP3YKMieg4ukwDdxrEbr7Dqansun9ibaWHflPg=; b=HW9oxCyfqc71KCw4UsiP8admKH P+HlyOF17U8jSApjuJR0uuYs6F64te/Sm7bn23eCPLD4E0dyvrzI1/gVG48RSV9hgjWZ/8BFo/Q+S K/Zx4vNotJpvWiUhzJHBXsrLqIIiX+RtBh/Ja38sU7XSvFzK494p7w0TGL3FDgtE7MaQVJX3AOwdN KoJN671KIfzeswq2GOqK36nU5MUnI3MT6WFzlGgkwAklV0biF/fc37AI9x4x9n/em2J0Ul2hdbIzO GhLB8gSBucCj3Na56/oRSbP69i2BE9N/pkYQ5Ctka8VZXi37usF5yIHBVF8sURG50/SunS3iRcKmu aN97bUDA==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1n94UM-007FUh-SL; Sun, 16 Jan 2022 12:18:26 +0000 From: "Matthew Wilcox (Oracle)" To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: William Kucharski , Matthew Wilcox Subject: [PATCH 09/12] mm/readahead: Align file mappings for non-DAX Date: Sun, 16 Jan 2022 12:18:19 
+0000 Message-Id: <20220116121822.1727633-10-willy@infradead.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220116121822.1727633-1-willy@infradead.org> References: <20220116121822.1727633-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: William Kucharski When we have the opportunity to use PMDs to map a file, we want to follow the same rules as DAX. Signed-off-by: William Kucharski Signed-off-by: Matthew Wilcox (Oracle) --- mm/huge_memory.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index f58524394dc1..28c29a0d854b 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -582,13 +582,10 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long ret; loff_t off = (loff_t)pgoff << PAGE_SHIFT; - if (!IS_DAX(filp->f_mapping->host) || !IS_ENABLED(CONFIG_FS_DAX_PMD)) - goto out; - ret = __thp_get_unmapped_area(filp, addr, len, off, flags, PMD_SIZE); if (ret) return ret; -out: + return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); } EXPORT_SYMBOL_GPL(thp_get_unmapped_area); From patchwork Sun Jan 16 12:18:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 12714535 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73469C433FE for ; Sun, 16 Jan 2022 12:18:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235104AbiAPMSc (ORCPT ); Sun, 16 Jan 2022 07:18:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58160 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235079AbiAPMS3 (ORCPT ); Sun, 16 Jan 2022 07:18:29 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 98DE3C061574; Sun, 16 Jan 2022 04:18:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=FLQ8hTEx3kPzs6+4vpyjvLn3g8Qd/uBRVnCIZc5hTjc=; b=KBxjapCNRDOC2MCQv8cV1VL0HQ 47jOANY2g855xMDwaaYDRhp6D9v/f4OboicCyykTBHs6Qg3LFfvMKCUx4RIh5ri1N9eGOg2H1RwIJ /+SsKNIoTzmrcMECHMFkfduKDG4ZoFkIMX3q9jHUUDlbyuP0t+JE7YRs3CkwwgCqXkqo0b9QAzsHM p7wEW+iUeO9bFmJIZRpKHzvNMwDLpfDNZJS/oSqVblAOcKUQB839bqD8iII2ZAOCpuoB1liFU1wN1 B7tyFIirwZSZEEnJK185vxAWOZj/s/9TarkIpPFLjU5QkIjAA2vpakOu/WDroFOXc+duPI4X7WtLg stJhnfSA==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1n94UM-007FUn-W7; Sun, 16 Jan 2022 12:18:27 +0000 From: "Matthew Wilcox (Oracle)" To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 10/12] mm/readahead: Switch to page_cache_ra_order Date: Sun, 16 Jan 2022 12:18:20 +0000 Message-Id: <20220116121822.1727633-11-willy@infradead.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220116121822.1727633-1-willy@infradead.org> References: <20220116121822.1727633-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org 
do_page_cache_ra() was being exposed for the benefit of do_sync_mmap_readahead(). Switch it over to page_cache_ra_order() partly because it's a better interface but mostly for the benefit of the next patch. Signed-off-by: Matthew Wilcox (Oracle) --- mm/filemap.c | 2 +- mm/internal.h | 4 ++-- mm/readahead.c | 4 ++-- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/mm/filemap.c b/mm/filemap.c index fe079b676ab7..8f076f0fd94f 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2947,7 +2947,7 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf) ra->size = ra->ra_pages; ra->async_size = ra->ra_pages / 4; ractl._index = ra->start; - do_page_cache_ra(&ractl, ra->size, ra->async_size); + page_cache_ra_order(&ractl, ra, 0); return fpin; } diff --git a/mm/internal.h b/mm/internal.h index 26af8a5a5be3..dbc15201a9d4 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -82,8 +82,8 @@ void unmap_page_range(struct mmu_gather *tlb, unsigned long addr, unsigned long end, struct zap_details *details); -void do_page_cache_ra(struct readahead_control *, unsigned long nr_to_read, - unsigned long lookahead_size); +void page_cache_ra_order(struct readahead_control *, struct file_ra_state *, + unsigned int order); void force_page_cache_ra(struct readahead_control *, unsigned long nr); static inline void force_page_cache_readahead(struct address_space *mapping, struct file *file, pgoff_t index, unsigned long nr_to_read) diff --git a/mm/readahead.c b/mm/readahead.c index 5100eaf5b0ee..a20391d6a71b 100644 --- a/mm/readahead.c +++ b/mm/readahead.c @@ -247,7 +247,7 @@ EXPORT_SYMBOL_GPL(page_cache_ra_unbounded); * behaviour which would occur if page allocations are causing VM writeback. * We really don't want to intermingle reads and writes like that. */ -void do_page_cache_ra(struct readahead_control *ractl, +static void do_page_cache_ra(struct readahead_control *ractl, unsigned long nr_to_read, unsigned long lookahead_size) { struct inode *inode = ractl->mapping->host; @@ -462,7 +462,7 @@ static inline int ra_alloc_folio(struct readahead_control *ractl, pgoff_t index, return err; } -static void page_cache_ra_order(struct readahead_control *ractl, +void page_cache_ra_order(struct readahead_control *ractl, struct file_ra_state *ra, unsigned int new_order) { struct address_space *mapping = ractl->mapping; From patchwork Sun Jan 16 12:18:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 12714537 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10870C433FE for ; Sun, 16 Jan 2022 12:18:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235115AbiAPMSe (ORCPT ); Sun, 16 Jan 2022 07:18:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58162 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235076AbiAPMS3 (ORCPT ); Sun, 16 Jan 2022 07:18:29 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B59F9C06161C; Sun, 16 Jan 2022 04:18:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: 
References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=N9+b+qRj01tgXtfzT98WhdYz3W5Hnc+VP1/gS5QLC2A=; b=VHAVkzo/Ow57u7iZ+eMc5A5UVU oK9KEC5BirQhQqk0WgFmvGP0cW5qPgAI3u0rTZUbWIKy4GReRWXhYE6qAF5YGsnFyTnVtTU1CPQvW sMuigGT1W6w172hqhpOrtb8RP3JqhazxOsbc3RLche6LofYifVwENLX67RLinReTxXLBM7ikaks+h tq6Cfd/XYn/7dsOOjdJwynyLFIm51cdaVFnOTT7+MuEly0ckH8fJg23pMHYWyMPXLanbVBX0X/W9v N8aDt/KVCxjTCp3t3UPcuL49tk+Sa1+0Pmz2Ue6hgH6TOXhl4GC12ky4FCFgnWyq2IX9dKqZvkosR 0SMZC/YQ==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1n94UN-007FUp-2K; Sun, 16 Jan 2022 12:18:27 +0000 From: "Matthew Wilcox (Oracle)" To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 11/12] mm/filemap: Support VM_HUGEPAGE for file mappings Date: Sun, 16 Jan 2022 12:18:21 +0000 Message-Id: <20220116121822.1727633-12-willy@infradead.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220116121822.1727633-1-willy@infradead.org> References: <20220116121822.1727633-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org If the VM_HUGEPAGE flag is set, attempt to allocate PMD-sized folios during readahead, even if we have no history of readahead being successful. Signed-off-by: Matthew Wilcox (Oracle) --- mm/filemap.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/mm/filemap.c b/mm/filemap.c index 8f076f0fd94f..da190fc4e186 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2915,6 +2915,24 @@ static struct file *do_sync_mmap_readahead(struct vm_fault *vmf) struct file *fpin = NULL; unsigned int mmap_miss; +#ifdef CONFIG_TRANSPARENT_HUGEPAGE + /* Use the readahead code, even if readahead is disabled */ + if (vmf->vma->vm_flags & VM_HUGEPAGE) { + fpin = maybe_unlock_mmap_for_io(vmf, fpin); + ractl._index &= ~((unsigned long)HPAGE_PMD_NR - 1); + ra->size = HPAGE_PMD_NR; + /* + * Fetch two PMD folios, so we get the chance to actually + * readahead, unless we've been told not to. 
+ */ + if (!(vmf->vma->vm_flags & VM_RAND_READ)) + ra->size *= 2; + ra->async_size = HPAGE_PMD_NR; + page_cache_ra_order(&ractl, ra, HPAGE_PMD_ORDER); + return fpin; + } +#endif + /* If we don't want any read-ahead, don't bother */ if (vmf->vma->vm_flags & VM_RAND_READ) return fpin; From patchwork Sun Jan 16 12:18:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Wilcox X-Patchwork-Id: 12714534 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 90AA9C433F5 for ; Sun, 16 Jan 2022 12:18:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235097AbiAPMSb (ORCPT ); Sun, 16 Jan 2022 07:18:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58164 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235038AbiAPMS3 (ORCPT ); Sun, 16 Jan 2022 07:18:29 -0500 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CA7BAC06173E; Sun, 16 Jan 2022 04:18:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=n98nGVb9Cb+XvLGLcWCumBllUZqlAwFV4o2oMVvKDzc=; b=o2t+DeyBUbS3FevpABu9qYgtan rupYDPIC9rxuQOAHz+dZRxG8INRfiE7ODy5CXtaBeYJBCNNg3pVTAxPa4mpHTyByTf38ZmKRMscg2 fR5Nr5Mfh1eTKO6PbCjVgEAEhVno3UrZGhbLjsLQkMIH1kQdG6m9pA9NRt9ix/G1/MtOpwxZnrmqn 3AL6nVaDn1iqZZBHuFy9SEL8U3dzAec/FHw8YWsemfqETZDhmSqU3/6z8gGH7qs0FDVWMZQ5jHcmI GOaKc47uNoHsD34cT7Thya53pYEXEpJWuNse3/zAZBn/iWhOwEd6hgSkQnLt5nWaNgGkHjiMj6Mvn 1LDB0d3A==; Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2 (Red Hat Linux)) id 1n94UN-007FUv-5q; Sun, 16 Jan 2022 12:18:27 +0000 From: "Matthew Wilcox (Oracle)" To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org Cc: "Matthew Wilcox (Oracle)" Subject: [PATCH 12/12] selftests/vm/transhuge-stress: Support file-backed PMD folios Date: Sun, 16 Jan 2022 12:18:22 +0000 Message-Id: <20220116121822.1727633-13-willy@infradead.org> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220116121822.1727633-1-willy@infradead.org> References: <20220116121822.1727633-1-willy@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add a -f option to test PMD folios on files Signed-off-by: Matthew Wilcox (Oracle) --- tools/testing/selftests/vm/transhuge-stress.c | 35 +++++++++++++------ 1 file changed, 24 insertions(+), 11 deletions(-) diff --git a/tools/testing/selftests/vm/transhuge-stress.c b/tools/testing/selftests/vm/transhuge-stress.c index 5e4c036f6ad3..a03cb3fce1f6 100644 --- a/tools/testing/selftests/vm/transhuge-stress.c +++ b/tools/testing/selftests/vm/transhuge-stress.c @@ -26,15 +26,17 @@ #define PAGEMAP_PFN(ent) ((ent) & ((1ull << 55) - 1)) int pagemap_fd; +int backing_fd = -1; +int mmap_flags = MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE; +#define PROT_RW (PROT_READ | PROT_WRITE) int64_t allocate_transhuge(void *ptr) { uint64_t ent[2]; /* drop pmd */ - if (mmap(ptr, HPAGE_SIZE, PROT_READ | PROT_WRITE, - MAP_FIXED | MAP_ANONYMOUS | - MAP_NORESERVE | MAP_PRIVATE, -1, 0) != ptr) 
+ if (mmap(ptr, HPAGE_SIZE, PROT_RW, MAP_FIXED | mmap_flags, + backing_fd, 0) != ptr) errx(2, "mmap transhuge"); if (madvise(ptr, HPAGE_SIZE, MADV_HUGEPAGE)) @@ -60,6 +62,8 @@ int main(int argc, char **argv) size_t ram, len; void *ptr, *p; struct timespec a, b; + int i = 0; + char *name = NULL; double s; uint8_t *map; size_t map_len; @@ -69,13 +73,23 @@ int main(int argc, char **argv) ram = SIZE_MAX / 4; else ram *= sysconf(_SC_PAGESIZE); + len = ram; + + while (++i < argc) { + if (!strcmp(argv[i], "-h")) + errx(1, "usage: %s [size in MiB]", argv[0]); + else if (!strcmp(argv[i], "-f")) + name = argv[++i]; + else + len = atoll(argv[i]) << 20; + } - if (argc == 1) - len = ram; - else if (!strcmp(argv[1], "-h")) - errx(1, "usage: %s [size in MiB]", argv[0]); - else - len = atoll(argv[1]) << 20; + if (name) { + backing_fd = open(name, O_RDWR); + if (backing_fd == -1) + errx(2, "open %s", name); + mmap_flags = MAP_SHARED; + } warnx("allocate %zd transhuge pages, using %zd MiB virtual memory" " and %zd MiB of ram", len >> HPAGE_SHIFT, len >> 20, @@ -86,8 +100,7 @@ int main(int argc, char **argv) err(2, "open pagemap"); len -= len % HPAGE_SIZE; - ptr = mmap(NULL, len + HPAGE_SIZE, PROT_READ | PROT_WRITE, - MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE, -1, 0); + ptr = mmap(NULL, len + HPAGE_SIZE, PROT_RW, mmap_flags, backing_fd, 0); if (ptr == MAP_FAILED) err(2, "initial mmap"); ptr += HPAGE_SIZE - (uintptr_t)ptr % HPAGE_SIZE;