From patchwork Wed Dec 6 20:44:42 2023
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 13482242
From: "Matthew Wilcox (Oracle)"
To: Andrew Morton
Cc: "Matthew Wilcox (Oracle)", linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    Hugh Dickins, Viacheslav Dubeyko, "Kirill A. Shutemov",
    Luis Chamberlain, Hannes Reinecke
Subject: [PATCH] mm: Support order-1 folios in the page cache
Date: Wed, 6 Dec 2023 20:44:42 +0000
Message-Id: <20231206204442.771430-1-willy@infradead.org>
X-Mailer: git-send-email 2.37.1

Folios of order 1 have no space to store the deferred list.
This is not a problem for the page cache as file-backed folios are
never placed on the deferred list.  All we need to do is prevent the
core MM from touching the deferred list for order 1 folios and remove
the code which prevented us from allocating order 1 folios.

Link: https://lore.kernel.org/linux-mm/90344ea7-4eec-47ee-5996-0c22f42d6a6a@google.com/
Signed-off-by: Matthew Wilcox (Oracle)
---
 include/linux/huge_mm.h |  7 +++++--
 mm/filemap.c            |  2 --
 mm/huge_memory.c        | 23 ++++++++++++++++++-----
 mm/internal.h           |  4 +---
 mm/readahead.c          |  8 ++------
 5 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index fa0350b0812a..7b59ff685da3 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -140,7 +140,7 @@ bool hugepage_vma_check(struct vm_area_struct *vma, unsigned long vm_flags,
 unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags);
 
-void folio_prep_large_rmappable(struct folio *folio);
+struct folio *folio_prep_large_rmappable(struct folio *folio);
 bool can_split_folio(struct folio *folio, int *pextra_pins);
 int split_huge_page_to_list(struct page *page, struct list_head *list);
 static inline int split_huge_page(struct page *page)
@@ -280,7 +280,10 @@ static inline bool hugepage_vma_check(struct vm_area_struct *vma,
 	return false;
 }
 
-static inline void folio_prep_large_rmappable(struct folio *folio) {}
+static inline struct folio *folio_prep_large_rmappable(struct folio *folio)
+{
+	return folio;
+}
 
 #define transparent_hugepage_flags 0UL
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 32eedf3afd45..61321e920e30 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1911,8 +1911,6 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 			gfp_t alloc_gfp = gfp;
 
 			err = -ENOMEM;
-			if (order == 1)
-				order = 0;
 			if (order > 0)
 				alloc_gfp |= __GFP_NORETRY | __GFP_NOWARN;
 			folio = filemap_alloc_folio(alloc_gfp, order);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 4f542444a91f..0df68a318922 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -610,11 +610,15 @@ struct deferred_split *get_deferred_split_queue(struct folio *folio)
 }
 #endif
 
-void folio_prep_large_rmappable(struct folio *folio)
+struct folio *folio_prep_large_rmappable(struct folio *folio)
 {
-	VM_BUG_ON_FOLIO(folio_order(folio) < 2, folio);
-	INIT_LIST_HEAD(&folio->_deferred_list);
+	if (!folio || !folio_test_large(folio))
+		return folio;
+	if (folio_order(folio) > 1)
+		INIT_LIST_HEAD(&folio->_deferred_list);
 	folio_set_large_rmappable(folio);
+
+	return folio;
 }
 
 static inline bool is_transparent_hugepage(struct folio *folio)
@@ -2760,7 +2764,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 	/* Prevent deferred_split_scan() touching ->_refcount */
 	spin_lock(&ds_queue->split_queue_lock);
 	if (folio_ref_freeze(folio, 1 + extra_pins)) {
-		if (!list_empty(&folio->_deferred_list)) {
+		if (folio_order(folio) > 1 &&
+		    !list_empty(&folio->_deferred_list)) {
 			ds_queue->split_queue_len--;
 			list_del(&folio->_deferred_list);
 		}
@@ -2811,6 +2816,9 @@ void folio_undo_large_rmappable(struct folio *folio)
 	struct deferred_split *ds_queue;
 	unsigned long flags;
 
+	if (folio_order(folio) <= 1)
+		return;
+
 	/*
 	 * At this point, there is no one trying to add the folio to
 	 * deferred_list. If folio is not in deferred_list, it's safe
@@ -2836,7 +2844,12 @@ void deferred_split_folio(struct folio *folio)
 #endif
 	unsigned long flags;
 
-	VM_BUG_ON_FOLIO(folio_order(folio) < 2, folio);
+	/*
+	 * Order 1 folios have no space for a deferred list, but we also
+	 * won't waste much memory by not adding them to the deferred list.
+	 */
+	if (folio_order(folio) <= 1)
+		return;
 
 	/*
 	 * The try_to_unmap() in page reclaim path might reach here too,
diff --git a/mm/internal.h b/mm/internal.h
index b61034bd50f5..11a9021614dd 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -419,9 +419,7 @@ static inline struct folio *page_rmappable_folio(struct page *page)
 {
 	struct folio *folio = (struct folio *)page;
 
-	if (folio && folio_order(folio) > 1)
-		folio_prep_large_rmappable(folio);
-	return folio;
+	return folio_prep_large_rmappable(folio);
 }
 
 static inline void prep_compound_head(struct page *page, unsigned int order)
diff --git a/mm/readahead.c b/mm/readahead.c
index 6925e6959fd3..48cca8e8de17 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -513,14 +513,10 @@ void page_cache_ra_order(struct readahead_control *ractl,
 		/* Align with smaller pages if needed */
 		if (index & ((1UL << order) - 1)) {
 			order = __ffs(index);
-			if (order == 1)
-				order = 0;
 		}
 		/* Don't allocate pages past EOF */
-		while (index + (1UL << order) - 1 > limit) {
-			if (--order == 1)
-				order = 0;
-		}
+		while (index + (1UL << order) - 1 > limit)
+			--order;
 		err = ra_alloc_folio(ractl, index, mark, order, gfp);
 		if (err)
 			break;