From patchwork Wed Nov 8 19:12:47 2023
X-Patchwork-Submitter: Matthew Wilcox <willy@infradead.org>
X-Patchwork-Id: 13450484
Date: Wed, 8 Nov 2023 19:12:47 +0000
From: Matthew Wilcox <willy@infradead.org>
To: linux-mm@kvack.org
Subject: Shrinking struct page progress

At the bottom of this email is a patch which is all it currently takes
to make an allnoconfig build without page->index and page->mapping,
assuming you first apply
https://lore.kernel.org/linux-mm/20231108182809.602073-1-willy@infradead.org/

diffstat: 14 files changed, 27 insertions(+), 120 deletions(-)

There are a few "cheats" in it.  The obvious one is, of course,
allnoconfig: that lets me handwave over every filesystem.  Progress is
being made in that area, but there's a lot left to do.  Similarly,
allnoconfig lets me delete some of the helper functions which still use
page->index and page->mapping, like page_index().  That one's down to
just ten callers, so we are making progress (we've removed nine in the
last year).

So, what do I think this shows?

First, there's something odd going on with perf_mmap_fault().  Why does
it need to set ->mapping and ->index?

Second, percpu needs to stop abusing page->index ;-)

Third, it's getting to be time to turn the page allocator into a folio
allocator.  I'll have some patches to move us in that direction soon.
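For anyone doing one of those remaining page_index() conversions, the
pattern is usually mechanical.  Here's a minimal before/after sketch --
not taken from the patch below; example_file_pos() is a made-up caller,
while page_folio(), folio_file_pos() and folio_page_idx() are the real
helpers:

	/* Before: page_index() hides the swapcache special case. */
	static loff_t example_file_pos(struct page *page)
	{
		return (loff_t)page_index(page) << PAGE_SHIFT;
	}

	/* After: go through the folio, which owns ->index, and add
	 * the page's offset within the folio explicitly. */
	static loff_t example_file_pos(struct page *page)
	{
		struct folio *folio = page_folio(page);

		return folio_file_pos(folio) +
		       (loff_t)folio_page_idx(folio, page) * PAGE_SIZE;
	}

Once every caller is in the "after" form, ->index and ->mapping only
need to exist in struct folio, not struct page.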
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 2ce13e8a309b..2e8af7bd483d 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -138,7 +138,7 @@ const struct movable_operations *page_movable_ops(struct page *page)
 	VM_BUG_ON(!__PageMovable(page));
 
 	return (const struct movable_operations *)
-		((unsigned long)page->mapping - PAGE_MAPPING_MOVABLE);
+		((unsigned long)page->___mapping - PAGE_MAPPING_MOVABLE);
 }
 
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 418d26608ece..36fcd4421fb9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2216,17 +2216,6 @@ static inline void *folio_address(const struct folio *folio)
 
 extern pgoff_t __page_file_index(struct page *page);
 
-/*
- * Return the pagecache index of the passed page. Regular pagecache pages
- * use ->index whereas swapcache pages use swp_offset(->private)
- */
-static inline pgoff_t page_index(struct page *page)
-{
-	if (unlikely(PageSwapCache(page)))
-		return __page_file_index(page);
-	return page->index;
-}
-
 /*
  * Return true only if the page has been allocated with
  * ALLOC_NO_WATERMARKS and the low watermark was not
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 957ce38768b2..bc490ba9362f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -103,9 +103,9 @@ struct page {
 				struct list_head pcp_list;
 			};
 			/* See page-flags.h for PAGE_MAPPING_FLAGS */
-			struct address_space *mapping;
+			struct address_space *___mapping;
 			union {
-				pgoff_t index;		/* Our offset within mapping. */
+				pgoff_t ___index;	/* Our offset within mapping. */
 				unsigned long share;	/* share count for fsdax */
 			};
 			/**
@@ -361,9 +361,9 @@ struct folio {
 	static_assert(offsetof(struct page, pg) == offsetof(struct folio, fl))
 FOLIO_MATCH(flags, flags);
 FOLIO_MATCH(lru, lru);
-FOLIO_MATCH(mapping, mapping);
+FOLIO_MATCH(___mapping, mapping);
 FOLIO_MATCH(compound_head, lru);
-FOLIO_MATCH(index, index);
+FOLIO_MATCH(___index, index);
 FOLIO_MATCH(private, private);
 FOLIO_MATCH(_mapcount, _mapcount);
 FOLIO_MATCH(_refcount, _refcount);
@@ -449,7 +449,7 @@ struct ptdesc {
 TABLE_MATCH(flags, __page_flags);
 TABLE_MATCH(compound_head, pt_list);
 TABLE_MATCH(compound_head, _pt_pad_1);
-TABLE_MATCH(mapping, __page_mapping);
+TABLE_MATCH(___mapping, __page_mapping);
 TABLE_MATCH(rcu_head, pt_rcu_head);
 TABLE_MATCH(page_type, __page_type);
 TABLE_MATCH(_refcount, _refcount);
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index a88e64acebfe..a7dbc86cb6af 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -642,11 +642,6 @@ static __always_inline bool folio_mapping_flags(struct folio *folio)
 	return ((unsigned long)folio->mapping & PAGE_MAPPING_FLAGS) != 0;
 }
 
-static __always_inline int PageMappingFlags(struct page *page)
-{
-	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) != 0;
-}
-
 static __always_inline bool folio_test_anon(struct folio *folio)
 {
 	return ((unsigned long)folio->mapping & PAGE_MAPPING_ANON) != 0;
@@ -665,7 +660,7 @@ static __always_inline bool __folio_test_movable(const struct folio *folio)
 
 static __always_inline int __PageMovable(struct page *page)
 {
-	return ((unsigned long)page->mapping & PAGE_MAPPING_FLAGS) ==
+	return ((unsigned long)page->___mapping & PAGE_MAPPING_FLAGS) ==
 			PAGE_MAPPING_MOVABLE;
 }
 
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index bcc1ea44b4e8..34bddb7204c9 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -865,30 +865,10 @@ static inline struct folio *read_mapping_folio(struct address_space *mapping,
  */
 static inline pgoff_t page_to_pgoff(struct page *page)
 {
-	struct page *head;
-
-	if (likely(!PageTransTail(page)))
-		return page->index;
-
-	head = compound_head(page);
-	/*
-	 * We don't initialize ->index for tail pages: calculate based on
-	 * head page
-	 */
-	return head->index + page - head;
-}
-
-/*
- * Return byte-offset into filesystem object for page.
- */
-static inline loff_t page_offset(struct page *page)
-{
-	return ((loff_t)page->index) << PAGE_SHIFT;
-}
+	struct folio *folio;
 
-static inline loff_t page_file_offset(struct page *page)
-{
-	return ((loff_t)page_index(page)) << PAGE_SHIFT;
+	folio = page_folio(page);
+	return folio->index + folio_page_idx(folio, page);
 }
 
 /**
@@ -897,7 +877,7 @@ static inline loff_t page_file_offset(struct page *page)
  */
 static inline loff_t folio_pos(struct folio *folio)
 {
-	return page_offset(&folio->page);
+	return (loff_t)folio->index * PAGE_SIZE;
 }
 
 /**
@@ -909,7 +889,7 @@ static inline loff_t folio_pos(struct folio *folio)
  */
 static inline loff_t folio_file_pos(struct folio *folio)
 {
-	return page_file_offset(&folio->page);
+	return (loff_t)folio_index(folio) * PAGE_SIZE;
 }
 
 /*
@@ -1464,34 +1444,6 @@ static inline ssize_t folio_mkwrite_check_truncate(struct folio *folio,
 	return offset;
 }
 
-/**
- * page_mkwrite_check_truncate - check if page was truncated
- * @page: the page to check
- * @inode: the inode to check the page against
- *
- * Returns the number of bytes in the page up to EOF,
- * or -EFAULT if the page was truncated.
- */
-static inline int page_mkwrite_check_truncate(struct page *page,
-					      struct inode *inode)
-{
-	loff_t size = i_size_read(inode);
-	pgoff_t index = size >> PAGE_SHIFT;
-	int offset = offset_in_page(size);
-
-	if (page->mapping != inode->i_mapping)
-		return -EFAULT;
-
-	/* page is wholly inside EOF */
-	if (page->index < index)
-		return PAGE_SIZE;
-	/* page is wholly past EOF */
-	if (page->index > index || !offset)
-		return -EFAULT;
-	/* page is partially inside EOF */
-	return offset;
-}
-
 /**
  * i_blocks_per_folio - How many blocks fit in this folio.
  * @inode: The inode which contains the blocks.
diff --git a/include/linux/poison.h b/include/linux/poison.h
index 851a855d3868..0d99a5265fba 100644
--- a/include/linux/poison.h
+++ b/include/linux/poison.h
@@ -29,10 +29,6 @@
 /********** mm/page_poison.c **********/
 #define PAGE_POISON 0xaa
 
-/********** mm/page_alloc.c ************/
-
-#define TAIL_MAPPING	((void *) 0x400 + POISON_POINTER_DELTA)
-
 /********** mm/slab.c **********/
 /*
  * Magic nums for obj red zoning.
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 683dc086ef10..13acb878da17 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6147,8 +6147,6 @@ static vm_fault_t perf_mmap_fault(struct vm_fault *vmf)
 		goto unlock;
 
 	get_page(vmf->page);
-	vmf->page->mapping = vmf->vma->vm_file->f_mapping;
-	vmf->page->index   = vmf->pgoff;
 
 	ret = 0;
 unlock:
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index e8d82c2f07d0..5bae0c228f3b 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -637,7 +637,6 @@ static void rb_free_aux_page(struct perf_buffer *rb, int idx)
 	struct page *page = virt_to_page(rb->aux_pages[idx]);
 
 	ClearPagePrivate(page);
-	page->mapping = NULL;
 	__free_page(page);
 }
 
@@ -808,7 +807,6 @@ static void perf_mmap_free_page(void *addr)
 {
 	struct page *page = virt_to_page(addr);
 
-	page->mapping = NULL;
 	__free_page(page);
 }
 
diff --git a/mm/debug.c b/mm/debug.c
index ee533a5ceb79..de9b1a53cbf9 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -75,7 +75,7 @@ static void __dump_page(struct page *page)
 	 * and potentially other situations. (See the page_mapping()
 	 * implementation for what's missing here.)
 	 */
-	unsigned long tmp = (unsigned long)page->mapping;
+	unsigned long tmp = (unsigned long)page->___mapping;
 
 	if (tmp & PAGE_MAPPING_ANON)
 		mapping = NULL;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f31f02472396..da89280d10e4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2478,11 +2478,8 @@ static void __split_huge_page_tail(struct folio *folio, int tail,
 			 (1L << PG_dirty) |
 			 LRU_GEN_MASK | LRU_REFS_MASK));
 
-	/* ->mapping in first and second tail page is replaced by other uses */
-	VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
-			page_tail);
-	page_tail->mapping = head->mapping;
-	page_tail->index = head->index + tail;
+	new_folio->mapping = folio->mapping;
+	new_folio->index = folio->index + tail;
 
 	/*
 	 * page->private should not be set in tail pages. Fix up and warn once
diff --git a/mm/internal.h b/mm/internal.h
index 7e84ec0219b1..8d92d8f15793 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -438,7 +438,6 @@ static inline void prep_compound_tail(struct page *head, int tail_idx)
 {
 	struct page *p = head + tail_idx;
 
-	p->mapping = TAIL_MAPPING;
 	set_compound_head(p, head);
 	set_page_private(p, 0);
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 733732e7e0ba..8c75f703de0a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -215,12 +215,12 @@ gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
  */
 static inline int get_pcppage_migratetype(struct page *page)
 {
-	return page->index;
+	return page->___index;
 }
 
 static inline void set_pcppage_migratetype(struct page *page, int migratetype)
 {
-	page->index = migratetype;
+	page->___index = migratetype;
 }
 
 #ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
@@ -911,7 +911,7 @@ static inline bool page_expected_state(struct page *page,
 	if (unlikely(atomic_read(&page->_mapcount) != -1))
 		return false;
 
-	if (unlikely((unsigned long)page->mapping |
+	if (unlikely((unsigned long)page->___mapping |
 			page_ref_count(page) |
 #ifdef CONFIG_MEMCG
 			page->memcg_data |
@@ -928,7 +928,7 @@ static const char *page_bad_reason(struct page *page, unsigned long flags)
 
 	if (unlikely(atomic_read(&page->_mapcount) != -1))
 		bad_reason = "nonzero mapcount";
-	if (unlikely(page->mapping != NULL))
+	if (unlikely(page->___mapping != NULL))
 		bad_reason = "non-NULL mapping";
 	if (unlikely(page_ref_count(page) != 0))
 		bad_reason = "nonzero _refcount";
@@ -981,9 +981,7 @@ static int free_tail_page_prepare(struct page *head_page, struct page *page)
 		ret = 0;
 		goto out;
 	}
-	switch (page - head_page) {
-	case 1:
-		/* the first tail page: these may be in place of ->mapping */
+	if (page - head_page == 1) {
 		if (unlikely(folio_entire_mapcount(folio))) {
 			bad_page(page, "nonzero entire_mapcount");
 			goto out;
@@ -996,19 +994,6 @@ static int free_tail_page_prepare(struct page *head_page, struct page *page)
 			bad_page(page, "nonzero pincount");
 			goto out;
 		}
-		break;
-	case 2:
-		/*
-		 * the second tail page: ->mapping is
-		 * deferred_list.next -- ignore value.
-		 */
-		break;
-	default:
-		if (page->mapping != TAIL_MAPPING) {
-			bad_page(page, "corrupted mapping in tail page");
-			goto out;
-		}
-		break;
 	}
 	if (unlikely(!PageTail(page))) {
 		bad_page(page, "PageTail not set");
@@ -1020,7 +1005,7 @@ static int free_tail_page_prepare(struct page *head_page, struct page *page)
 	}
 	ret = 0;
 out:
-	page->mapping = NULL;
+	page->___mapping = NULL;
 	clear_compound_head(page);
 	return ret;
 }
@@ -1080,8 +1065,10 @@ static __always_inline bool free_pages_prepare(struct page *page,
 	bool skip_kasan_poison = should_skip_kasan_poison(page, fpi_flags);
 	bool init = want_init_on_free();
 	bool compound = PageCompound(page);
+	struct folio *folio;
 
 	VM_BUG_ON_PAGE(PageTail(page), page);
+	folio = (struct folio *)page;
 
 	trace_mm_page_free(page, order);
 	kmsan_free_page(page, order);
@@ -1121,8 +1108,8 @@ static __always_inline bool free_pages_prepare(struct page *page,
 			(page + i)->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
 		}
 	}
-	if (PageMappingFlags(page))
-		page->mapping = NULL;
+	if (folio_mapping_flags(folio))
+		folio->mapping = NULL;
 	if (memcg_kmem_online() && PageMemcgKmem(page))
 		__memcg_kmem_uncharge_page(page, order);
 	if (is_check_pages_enabled()) {
diff --git a/mm/percpu.c b/mm/percpu.c
index 7b97d31df767..01fb98220d6e 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -253,13 +253,13 @@ static int pcpu_chunk_slot(const struct pcpu_chunk *chunk)
 /* set the pointer to a chunk in a page struct */
 static void pcpu_set_page_chunk(struct page *page, struct pcpu_chunk *pcpu)
 {
-	page->index = (unsigned long)pcpu;
+	page->___index = (unsigned long)pcpu;
 }
 
 /* obtain pointer to a chunk from a page struct */
 static struct pcpu_chunk *pcpu_get_page_chunk(struct page *page)
 {
-	return (struct pcpu_chunk *)page->index;
+	return (struct pcpu_chunk *)page->___index;
 }
 
 static int __maybe_unused pcpu_page_idx(unsigned int cpu, int page_idx)
diff --git a/tools/include/linux/poison.h b/tools/include/linux/poison.h
index 2e6338ac5eed..76eca9f9a68e 100644
--- a/tools/include/linux/poison.h
+++ b/tools/include/linux/poison.h
@@ -38,10 +38,6 @@
 /********** mm/page_poison.c **********/
 #define PAGE_POISON 0xaa
 
-/********** mm/page_alloc.c ************/
-
-#define TAIL_MAPPING	((void *) 0x400 + POISON_POINTER_DELTA)
-
 /********** mm/slab.c **********/
 /*
  * Magic nums for obj red zoning.
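P.S. On the third point above: the folio allocator patches aren't in
this mail, but the direction can already be sketched with helpers that
exist in the tree today.  A purely illustrative before/after for a
hypothetical caller (folio_alloc() and folio_put() are real; the
surrounding code is invented):

	/* today: the caller gets, and must free, a bare struct page */
	struct page *page = alloc_pages(GFP_KERNEL, 2);
	if (!page)
		return -ENOMEM;
	/* ... use the four pages ... */
	__free_pages(page, 2);

	/* with a folio allocator: the caller never sees struct page */
	struct folio *folio = folio_alloc(GFP_KERNEL, 2);
	if (!folio)
		return -ENOMEM;
	/* ... use the four pages ... */
	folio_put(folio);

Either form allocates the same memory; the difference is the type the
caller sees, which is what lets us shrink what struct page has to carry.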