[RFC,v12,26/26] mm/luf: implement luf debug feature

Message ID 20250220052027.58847-27-byungchul@sk.com (mailing list archive)
State New
Series LUF(Lazy Unmap Flush) reducing tlb numbers over 90%

Commit Message

Byungchul Park Feb. 20, 2025, 5:20 a.m. UTC
We need a luf debug feature to detect if luf ever goes wrong.  As an
RFC, this patch suggests a simple implementation that reports
problematic situations caused by luf.
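
For orientation, here is a hedged sketch (not part of the diff) of how
the checks are meant to fire: the lufd_* hooks sit at the points where a
page's contents become reachable, e.g. kmap(), kmap_local_page(),
page_to_virt() and the pte/pmd/pud_mkwrite() helpers.  A hypothetical
caller would look like:

	static void *map_and_touch(struct page *page)
	{
		/* warns once if a luf TLB flush for this page is still pending */
		lufd_check_pages(page, 0);
		return page_address(page);
	}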

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 arch/riscv/include/asm/tlbflush.h |   3 +
 arch/riscv/mm/tlbflush.c          |  35 ++++-
 arch/x86/include/asm/pgtable.h    |  10 ++
 arch/x86/include/asm/tlbflush.h   |   3 +
 arch/x86/mm/pgtable.c             |  10 ++
 arch/x86/mm/tlb.c                 |  35 ++++-
 include/linux/highmem-internal.h  |   5 +
 include/linux/mm.h                |  20 ++-
 include/linux/mm_types.h          |  16 +--
 include/linux/mm_types_task.h     |  16 +++
 include/linux/sched.h             |   5 +
 mm/highmem.c                      |   1 +
 mm/memory.c                       |  12 ++
 mm/page_alloc.c                   |  34 ++++-
 mm/page_ext.c                     |   3 +
 mm/rmap.c                         | 229 ++++++++++++++++++++++++++++++
 16 files changed, 418 insertions(+), 19 deletions(-)

Patch

diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index ec5caeb3cf8ef..9451f3d22f229 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -69,6 +69,9 @@  bool arch_tlbbatch_check_done(struct arch_tlbflush_unmap_batch *batch, unsigned
 bool arch_tlbbatch_diet(struct arch_tlbflush_unmap_batch *batch, unsigned long ugen);
 void arch_tlbbatch_mark_ugen(struct arch_tlbflush_unmap_batch *batch, unsigned long ugen);
 void arch_mm_mark_ugen(struct mm_struct *mm, unsigned long ugen);
+#ifdef CONFIG_LUF_DEBUG
+extern void print_lufd_arch(void);
+#endif
 
 static inline void arch_tlbbatch_clear(struct arch_tlbflush_unmap_batch *batch)
 {
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 93afb7a299003..de91bfe0426c2 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -216,6 +216,25 @@  static int __init luf_init_arch(void)
 }
 early_initcall(luf_init_arch);
 
+#ifdef CONFIG_LUF_DEBUG
+static DEFINE_SPINLOCK(luf_debug_lock);
+#define lufd_lock(f) spin_lock_irqsave(&luf_debug_lock, (f))
+#define lufd_unlock(f) spin_unlock_irqrestore(&luf_debug_lock, (f))
+
+void print_lufd_arch(void)
+{
+	int cpu;
+
+	pr_cont("LUFD ARCH:");
+	for_each_cpu(cpu, cpu_possible_mask)
+		pr_cont(" %lu", atomic_long_read(per_cpu_ptr(&ugen_done, cpu)));
+	pr_cont("\n");
+}
+#else
+#define lufd_lock(f) do { (void)(f); } while(0)
+#define lufd_unlock(f) do { (void)(f); } while(0)
+#endif
+
 /*
  * batch will not be updated.
  */
@@ -223,17 +242,22 @@  bool arch_tlbbatch_check_done(struct arch_tlbflush_unmap_batch *batch,
 			unsigned long ugen)
 {
 	int cpu;
+	unsigned long flags;
 
 	if (!ugen)
 		goto out;
 
+	lufd_lock(flags);
 	for_each_cpu(cpu, &batch->cpumask) {
 		unsigned long done;
 
 		done = atomic_long_read(per_cpu_ptr(&ugen_done, cpu));
-		if (ugen_before(done, ugen))
+		if (ugen_before(done, ugen)) {
+			lufd_unlock(flags);
 			return false;
+		}
 	}
+	lufd_unlock(flags);
 	return true;
 out:
 	return cpumask_empty(&batch->cpumask);
@@ -243,10 +267,12 @@  bool arch_tlbbatch_diet(struct arch_tlbflush_unmap_batch *batch,
 			unsigned long ugen)
 {
 	int cpu;
+	unsigned long flags;
 
 	if (!ugen)
 		goto out;
 
+	lufd_lock(flags);
 	for_each_cpu(cpu, &batch->cpumask) {
 		unsigned long done;
 
@@ -254,6 +280,7 @@  bool arch_tlbbatch_diet(struct arch_tlbflush_unmap_batch *batch,
 		if (!ugen_before(done, ugen))
 			cpumask_clear_cpu(cpu, &batch->cpumask);
 	}
+	lufd_unlock(flags);
 out:
 	return cpumask_empty(&batch->cpumask);
 }
@@ -262,10 +289,12 @@  void arch_tlbbatch_mark_ugen(struct arch_tlbflush_unmap_batch *batch,
 			     unsigned long ugen)
 {
 	int cpu;
+	unsigned long flags;
 
 	if (!ugen)
 		return;
 
+	lufd_lock(flags);
 	for_each_cpu(cpu, &batch->cpumask) {
 		atomic_long_t *done = per_cpu_ptr(&ugen_done, cpu);
 		unsigned long old = atomic_long_read(done);
@@ -283,15 +312,18 @@  void arch_tlbbatch_mark_ugen(struct arch_tlbflush_unmap_batch *batch,
 		 */
 		atomic_long_cmpxchg(done, old, ugen);
 	}
+	lufd_unlock(flags);
 }
 
 void arch_mm_mark_ugen(struct mm_struct *mm, unsigned long ugen)
 {
 	int cpu;
+	unsigned long flags;
 
 	if (!ugen)
 		return;
 
+	lufd_lock(flags);
 	for_each_cpu(cpu, mm_cpumask(mm)) {
 		atomic_long_t *done = per_cpu_ptr(&ugen_done, cpu);
 		unsigned long old = atomic_long_read(done);
@@ -309,4 +341,5 @@  void arch_mm_mark_ugen(struct mm_struct *mm, unsigned long ugen)
 		 */
 		atomic_long_cmpxchg(done, old, ugen);
 	}
+	lufd_unlock(flags);
 }
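
As a rough illustration (values hypothetical), print_lufd_arch() dumps,
on one line, the ugen generation each possible CPU has completed
flushing up to:

	LUFD ARCH: 1042 1040 1042 1041

The debug-only luf_debug_lock serializes the check/diet/mark loops
against each other in this configuration.
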
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 593f10aabd45a..414bcabb23b51 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -695,12 +695,22 @@  static inline pud_t pud_mkyoung(pud_t pud)
 	return pud_set_flags(pud, _PAGE_ACCESSED);
 }
 
+#ifdef CONFIG_LUF_DEBUG
+pud_t pud_mkwrite(pud_t pud);
+static inline pud_t __pud_mkwrite(pud_t pud)
+{
+	pud = pud_set_flags(pud, _PAGE_RW);
+
+	return pud_clear_saveddirty(pud);
+}
+#else
 static inline pud_t pud_mkwrite(pud_t pud)
 {
 	pud = pud_set_flags(pud, _PAGE_RW);
 
 	return pud_clear_saveddirty(pud);
 }
+#endif
 
 #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
 static inline int pte_soft_dirty(pte_t pte)
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 1fc5bacd72dff..2825f4befb272 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -297,6 +297,9 @@  extern bool arch_tlbbatch_check_done(struct arch_tlbflush_unmap_batch *batch, un
 extern bool arch_tlbbatch_diet(struct arch_tlbflush_unmap_batch *batch, unsigned long ugen);
 extern void arch_tlbbatch_mark_ugen(struct arch_tlbflush_unmap_batch *batch, unsigned long ugen);
 extern void arch_mm_mark_ugen(struct mm_struct *mm, unsigned long ugen);
+#ifdef CONFIG_LUF_DEBUG
+extern void print_lufd_arch(void);
+#endif
 
 static inline void arch_tlbbatch_clear(struct arch_tlbflush_unmap_batch *batch)
 {
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 5745a354a241c..f72e4cfdb0a8d 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -901,6 +901,7 @@  int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
 
 pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma)
 {
+	lufd_check_pages(pte_page(pte), 0);
 	if (vma->vm_flags & VM_SHADOW_STACK)
 		return pte_mkwrite_shstk(pte);
 
@@ -911,6 +912,7 @@  pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma)
 
 pmd_t pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
 {
+	lufd_check_pages(pmd_page(pmd), PMD_ORDER);
 	if (vma->vm_flags & VM_SHADOW_STACK)
 		return pmd_mkwrite_shstk(pmd);
 
@@ -919,6 +921,14 @@  pmd_t pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
 	return pmd_clear_saveddirty(pmd);
 }
 
+#ifdef CONFIG_LUF_DEBUG
+pud_t pud_mkwrite(pud_t pud)
+{
+	lufd_check_pages(pud_page(pud), PUD_ORDER);
+	return __pud_mkwrite(pud);
+}
+#endif
+
 void arch_check_zapped_pte(struct vm_area_struct *vma, pte_t pte)
 {
 	/*
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 975f58fa4b30f..e9ae0d8f73442 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1253,6 +1253,25 @@  static int __init luf_init_arch(void)
 }
 early_initcall(luf_init_arch);
 
+#ifdef CONFIG_LUF_DEBUG
+static DEFINE_SPINLOCK(luf_debug_lock);
+#define lufd_lock(f) spin_lock_irqsave(&luf_debug_lock, (f))
+#define lufd_unlock(f) spin_unlock_irqrestore(&luf_debug_lock, (f))
+
+void print_lufd_arch(void)
+{
+	int cpu;
+
+	pr_cont("LUFD ARCH:");
+	for_each_cpu(cpu, cpu_possible_mask)
+		pr_cont(" %lu", atomic_long_read(per_cpu_ptr(&ugen_done, cpu)));
+	pr_cont("\n");
+}
+#else
+#define lufd_lock(f) do { (void)(f); } while(0)
+#define lufd_unlock(f) do { (void)(f); } while(0)
+#endif
+
 /*
  * batch will not be updated.
  */
@@ -1260,17 +1279,22 @@  bool arch_tlbbatch_check_done(struct arch_tlbflush_unmap_batch *batch,
 			unsigned long ugen)
 {
 	int cpu;
+	unsigned long flags;
 
 	if (!ugen)
 		goto out;
 
+	lufd_lock(flags);
 	for_each_cpu(cpu, &batch->cpumask) {
 		unsigned long done;
 
 		done = atomic_long_read(per_cpu_ptr(&ugen_done, cpu));
-		if (ugen_before(done, ugen))
+		if (ugen_before(done, ugen)) {
+			lufd_unlock(flags);
 			return false;
+		}
 	}
+	lufd_unlock(flags);
 	return true;
 out:
 	return cpumask_empty(&batch->cpumask);
@@ -1280,10 +1304,12 @@  bool arch_tlbbatch_diet(struct arch_tlbflush_unmap_batch *batch,
 			unsigned long ugen)
 {
 	int cpu;
+	unsigned long flags;
 
 	if (!ugen)
 		goto out;
 
+	lufd_lock(flags);
 	for_each_cpu(cpu, &batch->cpumask) {
 		unsigned long done;
 
@@ -1291,6 +1317,7 @@  bool arch_tlbbatch_diet(struct arch_tlbflush_unmap_batch *batch,
 		if (!ugen_before(done, ugen))
 			cpumask_clear_cpu(cpu, &batch->cpumask);
 	}
+	lufd_unlock(flags);
 out:
 	return cpumask_empty(&batch->cpumask);
 }
@@ -1299,10 +1326,12 @@  void arch_tlbbatch_mark_ugen(struct arch_tlbflush_unmap_batch *batch,
 			     unsigned long ugen)
 {
 	int cpu;
+	unsigned long flags;
 
 	if (!ugen)
 		return;
 
+	lufd_lock(flags);
 	for_each_cpu(cpu, &batch->cpumask) {
 		atomic_long_t *done = per_cpu_ptr(&ugen_done, cpu);
 		unsigned long old = atomic_long_read(done);
@@ -1320,15 +1349,18 @@  void arch_tlbbatch_mark_ugen(struct arch_tlbflush_unmap_batch *batch,
 		 */
 		atomic_long_cmpxchg(done, old, ugen);
 	}
+	lufd_unlock(flags);
 }
 
 void arch_mm_mark_ugen(struct mm_struct *mm, unsigned long ugen)
 {
 	int cpu;
+	unsigned long flags;
 
 	if (!ugen)
 		return;
 
+	lufd_lock(flags);
 	for_each_cpu(cpu, mm_cpumask(mm)) {
 		atomic_long_t *done = per_cpu_ptr(&ugen_done, cpu);
 		unsigned long old = atomic_long_read(done);
@@ -1346,6 +1378,7 @@  void arch_mm_mark_ugen(struct mm_struct *mm, unsigned long ugen)
 		 */
 		atomic_long_cmpxchg(done, old, ugen);
 	}
+	lufd_unlock(flags);
 }
 
 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
diff --git a/include/linux/highmem-internal.h b/include/linux/highmem-internal.h
index dd100e849f5e0..0792530d1be7b 100644
--- a/include/linux/highmem-internal.h
+++ b/include/linux/highmem-internal.h
@@ -41,6 +41,7 @@  static inline void *kmap(struct page *page)
 {
 	void *addr;
 
+	lufd_check_pages(page, 0);
 	might_sleep();
 	if (!PageHighMem(page))
 		addr = page_address(page);
@@ -161,6 +162,7 @@  static inline struct page *kmap_to_page(void *addr)
 
 static inline void *kmap(struct page *page)
 {
+	lufd_check_pages(page, 0);
 	might_sleep();
 	return page_address(page);
 }
@@ -177,11 +179,13 @@  static inline void kunmap(struct page *page)
 
 static inline void *kmap_local_page(struct page *page)
 {
+	lufd_check_pages(page, 0);
 	return page_address(page);
 }
 
 static inline void *kmap_local_folio(struct folio *folio, size_t offset)
 {
+	lufd_check_folio(folio);
 	return page_address(&folio->page) + offset;
 }
 
@@ -204,6 +208,7 @@  static inline void __kunmap_local(const void *addr)
 
 static inline void *kmap_atomic(struct page *page)
 {
+	lufd_check_pages(page, 0);
 	if (IS_ENABLED(CONFIG_PREEMPT_RT))
 		migrate_disable();
 	else
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5c81c9831bc5d..9572fbbb9d73f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -44,6 +44,24 @@  extern int sysctl_page_lock_unfairness;
 void mm_core_init(void);
 void init_mm_internals(void);
 
+#ifdef CONFIG_LUF_DEBUG
+void lufd_check_folio(struct folio *f);
+void lufd_check_pages(const struct page *p, unsigned int order);
+void lufd_check_zone_pages(struct zone *zone, struct page *page, unsigned int order);
+void lufd_check_queued_pages(void);
+void lufd_queue_page_for_check(struct page *page, int order);
+void lufd_mark_folio(struct folio *f, unsigned short luf_key);
+void lufd_mark_pages(struct page *p, unsigned int order, unsigned short luf_key);
+#else
+static inline void lufd_check_folio(struct folio *f) {}
+static inline void lufd_check_pages(const struct page *p, unsigned int order) {}
+static inline void lufd_check_zone_pages(struct zone *zone, struct page *page, unsigned int order) {}
+static inline void lufd_check_queued_pages(void) {}
+static inline void lufd_queue_page_for_check(struct page *page, int order) {}
+static inline void lufd_mark_folio(struct folio *f, unsigned short luf_key) {}
+static inline void lufd_mark_pages(struct page *p, unsigned int order, unsigned short luf_key) {}
+#endif
+
 #ifndef CONFIG_NUMA		/* Don't use mapnrs, do it properly */
 extern unsigned long max_mapnr;
 
@@ -113,7 +131,7 @@  extern int mmap_rnd_compat_bits __read_mostly;
 #endif
 
 #ifndef page_to_virt
-#define page_to_virt(x)	__va(PFN_PHYS(page_to_pfn(x)))
+#define page_to_virt(x)	({ lufd_check_pages(x, 0); __va(PFN_PHYS(page_to_pfn(x)));})
 #endif
 
 #ifndef lm_alias
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index e3132e1e5e5d2..e0c5712dc46ff 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -22,6 +22,10 @@ 
 
 #include <asm/mmu.h>
 
+#ifdef CONFIG_LUF_DEBUG
+extern struct page_ext_operations luf_debug_ops;
+#endif
+
 #ifndef AT_VECTOR_SIZE_ARCH
 #define AT_VECTOR_SIZE_ARCH 0
 #endif
@@ -32,18 +36,6 @@ 
 struct address_space;
 struct mem_cgroup;
 
-#ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
-struct luf_batch {
-	struct tlbflush_unmap_batch batch;
-	unsigned long ugen;
-	rwlock_t lock;
-};
-void luf_batch_init(struct luf_batch *lb);
-#else
-struct luf_batch {};
-static inline void luf_batch_init(struct luf_batch *lb) {}
-#endif
-
 /*
  * Each physical page in the system has a struct page associated with
  * it to keep track of whatever it is we are using the page for at the
diff --git a/include/linux/mm_types_task.h b/include/linux/mm_types_task.h
index bff5706b76e14..b5dfc451c009b 100644
--- a/include/linux/mm_types_task.h
+++ b/include/linux/mm_types_task.h
@@ -9,6 +9,7 @@ 
  */
 
 #include <linux/types.h>
+#include <linux/spinlock_types.h>
 
 #include <asm/page.h>
 
@@ -67,4 +68,19 @@  struct tlbflush_unmap_batch {
 #endif
 };
 
+#ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+struct luf_batch {
+	struct tlbflush_unmap_batch batch;
+	unsigned long ugen;
+	rwlock_t lock;
+};
+void luf_batch_init(struct luf_batch *lb);
+#else
+struct luf_batch {};
+static inline void luf_batch_init(struct luf_batch *lb) {}
+#endif
+
+#if defined(CONFIG_LUF_DEBUG)
+#define NR_LUFD_PAGES 512
+#endif
 #endif /* _LINUX_MM_TYPES_TASK_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 463cb2fb8f919..eb1487fa101e6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1380,6 +1380,11 @@  struct task_struct {
 	unsigned long luf_ugen;
 	unsigned long zone_ugen;
 	unsigned long wait_zone_ugen;
+#if defined(CONFIG_LUF_DEBUG)
+	struct page *lufd_pages[NR_LUFD_PAGES];
+	int lufd_pages_order[NR_LUFD_PAGES];
+	int lufd_pages_nr;
+#endif
 #endif
 
 	struct tlbflush_unmap_batch	tlb_ubc;
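
A hedged sketch of how these per-task fields are used by the
mm/page_alloc.c and mm/rmap.c hunks further down (control flow
simplified):

	/* while a luf takeoff window is open, the check is deferred ... */
	if (current->zone_ugen) {
		lufd_queue_page_for_check(page, buddy_order(page));
		return true;
	}

	/* ... and replayed when the window closes, from luf_takeoff_end() */
	lufd_check_queued_pages();

lufd_queue_page_for_check() stores up to NR_LUFD_PAGES page/order pairs
on the task; overflowing the array is reported with a one-time warning
and the entry is dropped.
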
diff --git a/mm/highmem.c b/mm/highmem.c
index ef3189b36cadb..a323d5a655bf9 100644
--- a/mm/highmem.c
+++ b/mm/highmem.c
@@ -576,6 +576,7 @@  void *__kmap_local_page_prot(struct page *page, pgprot_t prot)
 {
 	void *kmap;
 
+	lufd_check_pages(page, 0);
 	/*
 	 * To broaden the usage of the actual kmap_local() machinery always map
 	 * pages when debugging is enabled and the architecture has no problems
diff --git a/mm/memory.c b/mm/memory.c
index c98af5e567e89..89d047867d60d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6124,6 +6124,18 @@  vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 			mapping = vma->vm_file->f_mapping;
 	}
 
+#ifdef CONFIG_LUF_DEBUG
+	if (luf_flush) {
+		/*
+		 * If it has a VM_SHARED mapping, all the mms involved
+		 * in the struct address_space should be luf_flush'ed.
+		 */
+		if (mapping)
+			luf_flush_mapping(mapping);
+		luf_flush_mm(mm);
+	}
+#endif
+
 	if (unlikely(is_vm_hugetlb_page(vma)))
 		ret = hugetlb_fault(vma->vm_mm, vma, address, flags);
 	else
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ccbe49b78190a..c8ab60c60bb08 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -758,6 +758,8 @@  void luf_takeoff_end(struct zone *zone)
 		VM_WARN_ON(current->zone_ugen);
 		VM_WARN_ON(current->wait_zone_ugen);
 	}
+
+	lufd_check_queued_pages();
 }
 
 /*
@@ -853,8 +855,10 @@  bool luf_takeoff_check_and_fold(struct zone *zone, struct page *page)
 		struct luf_batch *lb;
 		unsigned long lb_ugen;
 
-		if (!luf_key)
+		if (!luf_key) {
+			lufd_check_pages(page, buddy_order(page));
 			return true;
+		}
 
 		lb = &luf_batch[luf_key];
 		read_lock_irqsave(&lb->lock, flags);
@@ -875,12 +879,15 @@  bool luf_takeoff_check_and_fold(struct zone *zone, struct page *page)
 
 		if (!current->luf_ugen || ugen_before(current->luf_ugen, lb_ugen))
 			current->luf_ugen = lb_ugen;
+		lufd_queue_page_for_check(page, buddy_order(page));
 		return true;
 	}
 
 	zone_ugen = page_zone_ugen(zone, page);
-	if (!zone_ugen)
+	if (!zone_ugen) {
+		lufd_check_pages(page, buddy_order(page));
 		return true;
+	}
 
 	/*
 	 * Should not be zero since zone-zone_ugen has been updated in
@@ -888,17 +895,23 @@  bool luf_takeoff_check_and_fold(struct zone *zone, struct page *page)
 	 */
 	VM_WARN_ON(!zone->zone_ugen);
 
-	if (!ugen_before(READ_ONCE(zone->zone_ugen_done), zone_ugen))
+	if (!ugen_before(READ_ONCE(zone->zone_ugen_done), zone_ugen)) {
+		lufd_check_pages(page, buddy_order(page));
 		return true;
+	}
 
 	if (current->luf_no_shootdown)
 		return false;
 
+	lufd_check_zone_pages(zone, page, buddy_order(page));
+
 	/*
 	 * zone batched flush has been already set.
 	 */
-	if (current->zone_ugen)
+	if (current->zone_ugen) {
+		lufd_queue_page_for_check(page, buddy_order(page));
 		return true;
+	}
 
 	/*
 	 * Others are already performing tlb shootdown for us.  All we
@@ -933,6 +946,7 @@  bool luf_takeoff_check_and_fold(struct zone *zone, struct page *page)
 		atomic_long_set(&zone->nr_luf_pages, 0);
 		fold_batch(tlb_ubc_takeoff, &zone->zone_batch, true);
 	}
+	lufd_queue_page_for_check(page, buddy_order(page));
 	return true;
 }
 #endif
@@ -1238,6 +1252,11 @@  static inline void __free_one_page(struct page *page,
 	} else
 		zone_ugen = page_zone_ugen(zone, page);
 
+	if (!zone_ugen)
+		lufd_check_pages(page, order);
+	else
+		lufd_check_zone_pages(zone, page, order);
+
 	while (order < MAX_PAGE_ORDER) {
 		int buddy_mt = migratetype;
 		unsigned long buddy_zone_ugen;
@@ -1299,6 +1318,10 @@  static inline void __free_one_page(struct page *page,
 		set_page_zone_ugen(page, zone_ugen);
 		pfn = combined_pfn;
 		order++;
+		if (!zone_ugen)
+			lufd_check_pages(page, order);
+		else
+			lufd_check_zone_pages(zone, page, order);
 	}
 
 done_merging:
@@ -3201,6 +3224,8 @@  void free_unref_page(struct page *page, unsigned int order,
 	unsigned long pfn = page_to_pfn(page);
 	int migratetype;
 
+	lufd_mark_pages(page, order, luf_key);
+
 	if (!pcp_allowed_order(order)) {
 		__free_pages_ok(page, order, FPI_NONE, luf_key);
 		return;
@@ -3253,6 +3278,7 @@  void free_unref_folios(struct folio_batch *folios, unsigned short luf_key)
 		unsigned long pfn = folio_pfn(folio);
 		unsigned int order = folio_order(folio);
 
+		lufd_mark_folio(folio, luf_key);
 		if (!free_pages_prepare(&folio->page, order))
 			continue;
 		/*
diff --git a/mm/page_ext.c b/mm/page_ext.c
index 641d93f6af4c1..be40bc2a93378 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -89,6 +89,9 @@  static struct page_ext_operations *page_ext_ops[] __initdata = {
 #ifdef CONFIG_PAGE_TABLE_CHECK
 	&page_table_check_ops,
 #endif
+#ifdef CONFIG_LUF_DEBUG
+	&luf_debug_ops,
+#endif
 };
 
 unsigned long page_ext_size;
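
For reference, a hedged sketch of the page_ext plumbing that
luf_debug_ops (registered above) provides: each page carries a shadow
struct luf_batch in its page extension, which the mm/rmap.c hunk below
reaches like this:

	struct page_ext *page_ext = page_ext_get(page);
	struct luf_batch *lb;

	if (page_ext) {
		lb = page_ext_data(page_ext, &luf_debug_ops);
		/* ... check or fold pending flush state into lb ... */
		page_ext_put(page_ext);
	}
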
diff --git a/mm/rmap.c b/mm/rmap.c
index 55003eb0b4936..fd6d5cb0fa8d0 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1161,6 +1161,235 @@  static bool should_defer_flush(struct mm_struct *mm, enum ttu_flags flags)
 }
 #endif /* CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH */
 
+#ifdef CONFIG_LUF_DEBUG
+
+static bool need_luf_debug(void)
+{
+	return true;
+}
+
+static void init_luf_debug(void)
+{
+	/* Do nothing */
+}
+
+struct page_ext_operations luf_debug_ops = {
+	.size = sizeof(struct luf_batch),
+	.need = need_luf_debug,
+	.init = init_luf_debug,
+	.need_shared_flags = false,
+};
+
+static bool __lufd_check_zone_pages(struct page *page, int nr,
+		struct tlbflush_unmap_batch *batch, unsigned long ugen)
+{
+	int i;
+
+	for (i = 0; i < nr; i++) {
+		struct page_ext *page_ext;
+		struct luf_batch *lb;
+		unsigned long lb_ugen;
+		unsigned long flags;
+		bool ret;
+
+		page_ext = page_ext_get(page + i);
+		if (!page_ext)
+			continue;
+
+		lb = (struct luf_batch *)page_ext_data(page_ext, &luf_debug_ops);
+		write_lock_irqsave(&lb->lock, flags);
+		lb_ugen = lb->ugen;
+		ret = arch_tlbbatch_done(&lb->batch.arch, &batch->arch);
+		write_unlock_irqrestore(&lb->lock, flags);
+		page_ext_put(page_ext);
+
+		if (!ret || ugen_before(ugen, lb_ugen))
+			return false;
+	}
+	return true;
+}
+
+void lufd_check_zone_pages(struct zone *zone, struct page *page, unsigned int order)
+{
+	bool warn;
+	static bool once = false;
+
+	if (!page || !zone)
+		return;
+
+	warn = !__lufd_check_zone_pages(page, 1 << order,
+			&zone->zone_batch, zone->luf_ugen);
+
+	if (warn && !READ_ONCE(once)) {
+		WRITE_ONCE(once, true);
+		VM_WARN(1, "LUFD: ugen(%lu) page(%p) order(%u)\n",
+				atomic_long_read(&luf_ugen), page, order);
+		print_lufd_arch();
+	}
+}
+
+static bool __lufd_check_pages(const struct page *page, int nr)
+{
+	int i;
+
+	for (i = 0; i < nr; i++) {
+		struct page_ext *page_ext;
+		struct luf_batch *lb;
+		unsigned long lb_ugen;
+		unsigned long flags;
+		bool ret;
+
+		page_ext = page_ext_get(page + i);
+		if (!page_ext)
+			continue;
+
+		lb = (struct luf_batch *)page_ext_data(page_ext, &luf_debug_ops);
+		write_lock_irqsave(&lb->lock, flags);
+		lb_ugen = lb->ugen;
+		ret = arch_tlbbatch_diet(&lb->batch.arch, lb_ugen);
+		write_unlock_irqrestore(&lb->lock, flags);
+		page_ext_put(page_ext);
+
+		if (!ret)
+			return false;
+	}
+	return true;
+}
+
+void lufd_queue_page_for_check(struct page *page, int order)
+{
+	struct page **parray = current->lufd_pages;
+	int *oarray = current->lufd_pages_order;
+
+	if (!page)
+		return;
+
+	if (current->lufd_pages_nr >= NR_LUFD_PAGES) {
+		VM_WARN_ONCE(1, "LUFD: NR_LUFD_PAGES is too small.\n");
+		return;
+	}
+
+	*(parray + current->lufd_pages_nr) = page;
+	*(oarray + current->lufd_pages_nr) = order;
+	current->lufd_pages_nr++;
+}
+
+void lufd_check_queued_pages(void)
+{
+	struct page **parray = current->lufd_pages;
+	int *oarray = current->lufd_pages_order;
+	int i;
+
+	for (i = 0; i < current->lufd_pages_nr; i++)
+		lufd_check_pages(*(parray + i), *(oarray + i));
+	current->lufd_pages_nr = 0;
+}
+
+void lufd_check_folio(struct folio *folio)
+{
+	struct page *page;
+	int nr;
+	bool warn;
+	static bool once = false;
+
+	if (!folio)
+		return;
+
+	page = folio_page(folio, 0);
+	nr = folio_nr_pages(folio);
+
+	warn = !__lufd_check_pages(page, nr);
+
+	if (warn && !READ_ONCE(once)) {
+		WRITE_ONCE(once, true);
+		VM_WARN(1, "LUFD: ugen(%lu) page(%p) nr(%d)\n",
+				atomic_long_read(&luf_ugen), page, nr);
+		print_lufd_arch();
+	}
+}
+EXPORT_SYMBOL(lufd_check_folio);
+
+void lufd_check_pages(const struct page *page, unsigned int order)
+{
+	bool warn;
+	static bool once = false;
+
+	if (!page)
+		return;
+
+	warn = !__lufd_check_pages(page, 1 << order);
+
+	if (warn && !READ_ONCE(once)) {
+		WRITE_ONCE(once, true);
+		VM_WARN(1, "LUFD: ugen(%lu) page(%p) order(%u)\n",
+				atomic_long_read(&luf_ugen), page, order);
+		print_lufd_arch();
+	}
+}
+EXPORT_SYMBOL(lufd_check_pages);
+
+static void __lufd_mark_pages(struct page *page, int nr, unsigned short luf_key)
+{
+	int i;
+
+	for (i = 0; i < nr; i++) {
+		struct page_ext *page_ext;
+		struct luf_batch *lb;
+
+		page_ext = page_ext_get(page + i);
+		if (!page_ext)
+			continue;
+
+		lb = (struct luf_batch *)page_ext_data(page_ext, &luf_debug_ops);
+		fold_luf_batch(lb, &luf_batch[luf_key]);
+		page_ext_put(page_ext);
+	}
+}
+
+void lufd_mark_folio(struct folio *folio, unsigned short luf_key)
+{
+	struct page *page;
+	int nr;
+	bool warn;
+	static bool once = false;
+
+	if (!luf_key)
+		return;
+
+	page = folio_page(folio, 0);
+	nr = folio_nr_pages(folio);
+
+	warn = !__lufd_check_pages(page, nr);
+	__lufd_mark_pages(page, nr, luf_key);
+
+	if (warn && !READ_ONCE(once)) {
+		WRITE_ONCE(once, true);
+		VM_WARN(1, "LUFD: ugen(%lu) page(%p) nr(%d)\n",
+				atomic_long_read(&luf_ugen), page, nr);
+		print_lufd_arch();
+	}
+}
+
+void lufd_mark_pages(struct page *page, unsigned int order, unsigned short luf_key)
+{
+	bool warn;
+	static bool once = false;
+
+	if (!luf_key)
+		return;
+
+	warn = !__lufd_check_pages(page, 1 << order);
+	__lufd_mark_pages(page, 1 << order, luf_key);
+
+	if (warn && !READ_ONCE(once)) {
+		WRITE_ONCE(once, true);
+		VM_WARN(1, "LUFD: ugen(%lu) page(%p) order(%u)\n",
+				atomic_long_read(&luf_ugen), page, order);
+		print_lufd_arch();
+	}
+}
+#endif
+
 /**
  * page_address_in_vma - The virtual address of a page in this VMA.
  * @folio: The folio containing the page.