Message ID | 20240223063608.2605736-1-liuyongqiang13@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] arm: flush: check if the folio is reserved for no-mapping addresses | expand |
Since some abuses of pfn_valid() have been reported, I check all the use of pfn_valid(), and find some suspicious cases. phys_mem_access_prot() defined in arch/arm/mm/mmu.c returns pgprot_noncached() when pfn_valid() returns false. I think it’s purpose is to return pgprot_noncached() when the pfn is not in RAM, and the use of pfn_valid() is incorrect. Notably, phys_mem_access_prot() defined in arm64 uses pfn_is_map_memory() instead of pfn_valid() since commit 873ba463914c (arm64: decouple check whether pfn is in linear map from pfn_valid()). Similarly, virt_addr_valid() defined in arm64 uses pfn_is_map_memory() instead of pfn_valid() since commit 873ba463914c (arm64: decouple check whether pfn is in linear map from pfn_valid()), But virt_addr_valid() still uses pfn_valid(). Besides, the implementation of x86 also uses pfn_valid(). update_mmu_cache_range() defined in arch/arm/mm/fault-armv.c checks pfn_valid() and then calls __flush_dcache_folio(). This case is similar to the case reported by Yongqiang Liu, the pfn may not be a RAM pfn, and the system will crash in __flush_dcache_folio() due to the kernel linear mapping is not established. virt_addr_valid() is used to check whether a vrtual address is valid linear mapping. Are these uses of pfn_valid() incorrect? pfn_modify_allowed() defined in arch/x86/mm/mmap.c checks pfn_valid(), and the comment says it is intended to check whether the pfn is in real memory. So the use of pfn_valid() should be incorrent. This case is only involved when the cpu is affected by X86_BUG_L1TF. try_ram_remap() defined in kernel/iomem.c returns the linear address when three checks are passed. One of the checks is pfn_valid(). The only caller memremap() guarantees the pfn passed to try_ram_remap() is in RAM, but the pfn may be in NOMAP memory regions and is not mapped in linear mapping. commit 260364d112bc (arm[64]/memremap: don't abuse pfn_valid() to ensure presence of linear map) solves it by checking in arch_memremap_can_ram_remap(), However, if other architectures involve this issue? Do these suspicious case abuse pfn_valid() really? Thanks
On Mon, Feb 26, 2024 at 02:38:58PM +0800, Jinjiang Tu wrote: > Since some abuses of pfn_valid() have been reported, I check all the use of > pfn_valid(), and find some suspicious cases. I do get really tired of kernel interfaces migrating to become something different from what they were when code was originally written, and then having users of that interface labelled as "suspicious" or an "abuse". I don't follow MM stuff, so I can't comment on the rights or wrongs of this, but what I understood was that pfn_valid() is there to check that for a given PFN, pfn_to_page() would return a valid pointer to a struct page. Given that we only have struct page's for memory which the kernel is managing, this seems entirely correct. There may be other RAM in the system which is being managed via different mechanisms, and because those won't have a struct page associated with them, pfn_valid() should be returning false (which means memory carved out for e.g. other processors etc) won't be mapped cacheable. Or at least that's how things used to be - because 32-bit Arm's pfn_valid() was implemented by checking memblock for memory, and stolen memory _was_ removed from memblock.memory (see arm_memblock_steal) or quite simply these areas were not passed to the kernel as memory.
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c index d19d140a10c7..0749cf8a6637 100644 --- a/arch/arm/mm/flush.c +++ b/arch/arm/mm/flush.c @@ -296,6 +296,9 @@ void __sync_icache_dcache(pte_t pteval) return; folio = page_folio(pfn_to_page(pfn)); + if (folio_test_reserved(folio)) + return; + if (cache_is_vipt_aliasing()) mapping = folio_flush_mapping(folio); else
Since commit a4d5613c4dc6 ("arm: extend pfn_valid to take into account freed memory map alignment") changes the semantics of pfn_valid() to check presence of the memory map for a PFN. A valid page for an address which is reserved but not mapped by the kernel[1], the system crashed during some uio test with the following memory layout: node 0: [mem 0x00000000c0a00000-0x00000000cc8fffff] node 0: [mem 0x00000000d0000000-0x00000000da1fffff] the uio layout is:0xc0900000, 0x100000 the crash backtrace like: Unable to handle kernel paging request at virtual address bff00000 [...] CPU: 1 PID: 465 Comm: startapp.bin Tainted: G O 5.10.0 #1 Hardware name: Generic DT based system PC is at b15_flush_kern_dcache_area+0x24/0x3c LR is at __sync_icache_dcache+0x6c/0x98 [...] (b15_flush_kern_dcache_area) from (__sync_icache_dcache+0x6c/0x98) (__sync_icache_dcache) from (set_pte_at+0x28/0x54) (set_pte_at) from (remap_pfn_range+0x1a0/0x274) (remap_pfn_range) from (uio_mmap+0x184/0x1b8 [uio]) (uio_mmap [uio]) from (__mmap_region+0x264/0x5f4) (__mmap_region) from (__do_mmap_mm+0x3ec/0x440) (__do_mmap_mm) from (do_mmap+0x50/0x58) (do_mmap) from (vm_mmap_pgoff+0xfc/0x188) (vm_mmap_pgoff) from (ksys_mmap_pgoff+0xac/0xc4) (ksys_mmap_pgoff) from (ret_fast_syscall+0x0/0x5c) Code: e0801001 e2423001 e1c00003 f57ff04f (ee070f3e) ---[ end trace 09cf0734c3805d52 ]--- Kernel panic - not syncing: Fatal exception So check if PG_reserved was set to solve this issue. [1]: https://lore.kernel.org/lkml/Zbtdue57RO0QScJM@linux.ibm.com/ Fixes: a4d5613c4dc6 ("arm: extend pfn_valid to take into account freed memory map alignment") Suggested-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com> --- v1 -> v2 - Improve commit message - Use helpers instead of PG_reserved - drop the unnecessary include headers arch/arm/mm/flush.c | 3 +++ 1 file changed, 3 insertions(+)