From patchwork Wed Mar 31 18:47:02 2021
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 12175919
From: "Matthew Wilcox (Oracle)"
To: linux-mm@kvack.org
Cc: "Matthew Wilcox (Oracle)", linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-cachefs@redhat.com,
 linux-afs@lists.infradead.org
Subject: [PATCH v6 01/27] mm: Introduce struct folio
Date: Wed, 31 Mar 2021 19:47:02 +0100
Message-Id: <20210331184728.1188084-2-willy@infradead.org>
In-Reply-To: <20210331184728.1188084-1-willy@infradead.org>
References: <20210331184728.1188084-1-willy@infradead.org>

A struct folio is a new
abstraction to replace the venerable struct page. A function which takes
a struct folio argument declares that it will operate on the entire
(possibly compound) page, not just PAGE_SIZE bytes. In return, the caller
guarantees that the pointer it is passing does not point to a tail page.

Signed-off-by: Matthew Wilcox (Oracle)
---
 include/linux/mm.h       | 78 ++++++++++++++++++++++++++++++++++++++++
 include/linux/mm_types.h | 65 +++++++++++++++++++++++++++++++++
 mm/util.c                | 19 ++++++++++
 3 files changed, 162 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3e4dc6678eb2..761063e733bf 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -936,6 +936,20 @@ static inline unsigned int compound_order(struct page *page)
 	return page[1].compound_order;
 }
 
+/**
+ * folio_order - The allocation order of a folio.
+ * @folio: The folio.
+ *
+ * A folio is composed of 2^order pages. See get_order() for the definition
+ * of order.
+ *
+ * Return: The order of the folio.
+ */
+static inline unsigned int folio_order(struct folio *folio)
+{
+	return compound_order(&folio->page);
+}
+
 static inline bool hpage_pincount_available(struct page *page)
 {
 	/*
@@ -1581,6 +1595,69 @@ static inline void set_page_links(struct page *page, enum zone_type zone,
 #endif
 }
 
+/**
+ * folio_nr_pages - The number of pages in the folio.
+ * @folio: The folio.
+ *
+ * Return: A number which is a power of two.
+ */
+static inline unsigned long folio_nr_pages(struct folio *folio)
+{
+	return compound_nr(&folio->page);
+}
+
+/**
+ * folio_next - Move to the next physical folio.
+ * @folio: The folio we're currently operating on.
+ *
+ * If you have physically contiguous memory which may span more than
+ * one folio (eg a &struct bio_vec), use this function to move from one
+ * folio to the next. Do not use it if the memory is only virtually
+ * contiguous as the folios are almost certainly not adjacent to each
+ * other. This is the folio equivalent to writing ``page++``.
+ *
+ * Context: We assume that the folios are refcounted and/or locked at a
+ * higher level and do not adjust the reference counts.
+ * Return: The next struct folio.
+ */
+static inline struct folio *folio_next(struct folio *folio)
+{
+#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
+	return (struct folio *)nth_page(&folio->page, folio_nr_pages(folio));
+#else
+	return folio + folio_nr_pages(folio);
+#endif
+}
+
+/**
+ * folio_shift - The number of bits covered by this folio.
+ * @folio: The folio.
+ *
+ * A folio contains a number of bytes which is a power-of-two in size.
+ * This function tells you which power-of-two the folio is.
+ *
+ * Context: The caller should have a reference on the folio to prevent
+ * it from being split. It is not necessary for the folio to be locked.
+ * Return: The base-2 logarithm of the size of this folio.
+ */
+static inline unsigned int folio_shift(struct folio *folio)
+{
+	return PAGE_SHIFT + folio_order(folio);
+}
+
+/**
+ * folio_size - The number of bytes in a folio.
+ * @folio: The folio.
+ *
+ * Context: The caller should have a reference on the folio to prevent
+ * it from being split. It is not necessary for the folio to be locked.
+ * Return: The number of bytes in this folio.
+ */
+static inline size_t folio_size(struct folio *folio)
+{
+	return PAGE_SIZE << folio_order(folio);
+}
+
 /*
  * Some inline functions in vmstat.h depend on page_zone()
  */
@@ -1685,6 +1762,7 @@ extern void pagefault_out_of_memory(void);
 
 #define offset_in_page(p)	((unsigned long)(p) & ~PAGE_MASK)
 #define offset_in_thp(page, p)	((unsigned long)(p) & (thp_size(page) - 1))
+#define offset_in_folio(folio, p) ((unsigned long)(p) & (folio_size(folio) - 1))
 
 /*
  * Flags passed to show_mem() and show_free_areas() to suppress output in
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6613b26a8894..a0c7894fad1d 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -224,6 +224,71 @@ struct page {
 #endif
 } _struct_page_alignment;
 
+/**
+ * struct folio - Represents a contiguous set of bytes.
+ * @flags: Identical to the page flags.
+ * @lru: Least Recently Used list; tracks how recently this folio was used.
+ * @mapping: The file this page belongs to, or refers to the anon_vma for
+ * anonymous pages.
+ * @index: Offset within the file, in units of pages. For anonymous pages,
+ * this is the index from the beginning of the mmap.
+ * @private: Filesystem per-folio data (see attach_folio_private()).
+ * Used for swp_entry_t if FolioSwapCache().
+ * @_mapcount: How many times this folio is mapped to userspace. Use
+ * folio_mapcount() to access it.
+ * @_refcount: Number of references to this folio. Use folio_ref_count()
+ * to read it.
+ * @memcg_data: Memory Control Group data.
+ *
+ * A folio is a physically, virtually and logically contiguous set
+ * of bytes. It is a power-of-two in size, and it is aligned to that
+ * same power-of-two. It is at least as large as %PAGE_SIZE. If it is
+ * in the page cache, it is at a file offset which is a multiple of that
+ * power-of-two.
+ */
+struct folio {
+	/* private: don't document the anon union */
+	union {
+		struct {
+	/* public: */
+			unsigned long flags;
+			struct list_head lru;
+			struct address_space *mapping;
+			pgoff_t index;
+			unsigned long private;
+			atomic_t _mapcount;
+			atomic_t _refcount;
+#ifdef CONFIG_MEMCG
+			unsigned long memcg_data;
+#endif
+	/* private: the union with struct page is transitional */
+		};
+		struct page page;
+	};
+};
+
+/**
+ * page_folio - Converts from page to folio.
+ * @page: The page.
+ *
+ * Every page is part of a folio. This function cannot be called on a
+ * NULL pointer.
+ *
+ * Context: No reference, nor lock is required on @page. If the caller
+ * does not hold a reference, this call may race with a folio split, so
+ * it should re-check the folio still contains this page after gaining
+ * a reference on the folio.
+ * Return: The folio which contains this page.
+ */
+static inline struct folio *page_folio(struct page *page)
+{
+	unsigned long head = READ_ONCE(page->compound_head);
+
+	if (unlikely(head & 1))
+		return (struct folio *)(head - 1);
+	return (struct folio *)page;
+}
+
 static inline atomic_t *compound_mapcount_ptr(struct page *page)
 {
 	return &page[1].compound_mapcount;
diff --git a/mm/util.c b/mm/util.c
index 0b6dd9d81da7..521a772f06eb 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -686,6 +686,25 @@ struct anon_vma *page_anon_vma(struct page *page)
 	return __page_rmapping(page);
 }
 
+static inline void folio_build_bug(void)
+{
+#define FOLIO_MATCH(pg, fl)						\
+BUILD_BUG_ON(offsetof(struct page, pg) != offsetof(struct folio, fl));
+
+	FOLIO_MATCH(flags, flags);
+	FOLIO_MATCH(lru, lru);
+	FOLIO_MATCH(mapping, mapping);
+	FOLIO_MATCH(index, index);
+	FOLIO_MATCH(private, private);
+	FOLIO_MATCH(_mapcount, _mapcount);
+	FOLIO_MATCH(_refcount, _refcount);
+#ifdef CONFIG_MEMCG
+	FOLIO_MATCH(memcg_data, memcg_data);
+#endif
+#undef FOLIO_MATCH
+	BUILD_BUG_ON(sizeof(struct page) != sizeof(struct folio));
+}
+
 struct address_space *page_mapping(struct page *page)
 {
 	struct address_space *mapping;