Message ID | 20240903164532.3874988-1-scott@os.amperecomputing.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v3] arm64: Expose the end of the linear map in PHYSMEM_END |
On Tue, Sep 03, 2024 at 09:45:32AM -0700, D Scott Phillips wrote:
> The memory hot-plug and resource management code needs to know the
> largest address which can fit in the linear map, so set
> PHYSMEM_END for that purpose.
>
> This fixes a crash[1] at boot when amdgpu tries to create
> DEVICE_PRIVATE_MEMORY and is given a physical address by the
> resource management code which is outside the range which can have
> a `struct page`
>
> The Fixes: commit listed below isn't actually broken, but the
> reorganization of vmemmap causes the improper DEVICE_PRIVATE_MEMORY address
> to go from a warning to a crash.
>
> [1]: Unable to handle kernel paging request at virtual address

No need to have [1]: prefix here and also read this
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages
and amend commit message accordingly.

> 000001ffa6000034
> Mem abort info:
>   ESR = 0x0000000096000044
>   EC = 0x25: DABT (current EL), IL = 32 bits
>   SET = 0, FnV = 0
>   EA = 0, S1PTW = 0
>   FSC = 0x04: level 0 translation fault
> Data abort info:
>   ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
>   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
>   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> user pgtable: 4k pages, 48-bit VAs, pgdp=000008000287c000
> [000001ffa6000034] pgd=0000000000000000, p4d=0000000000000000
> Call trace:
>  __init_zone_device_page.constprop.0+0x2c/0xa8
>  memmap_init_zone_device+0xf0/0x210
>  pagemap_range+0x1e0/0x410
>  memremap_pages+0x18c/0x2e0
>  devm_memremap_pages+0x30/0x90
>  kgd2kfd_init_zone_device+0xf0/0x200 [amdgpu]
>  amdgpu_device_ip_init+0x674/0x888 [amdgpu]
>  amdgpu_device_init+0x7a4/0xea0 [amdgpu]
>  amdgpu_driver_load_kms+0x28/0x1c0 [amdgpu]
>  amdgpu_pci_probe+0x1a0/0x560 [amdgpu]
>  local_pci_probe+0x48/0xb8
>  work_for_cpu_fn+0x24/0x40
>  process_one_work+0x170/0x3e0
>  worker_thread+0x2ac/0x3e0
>  kthread+0xf4/0x108
>  ret_from_fork+0x10/0x20
On Tue, 03 Sep 2024 09:45:32 -0700, D Scott Phillips wrote:
> The memory hot-plug and resource management code needs to know the
> largest address which can fit in the linear map, so set
> PHYSMEM_END for that purpose.
>
> This fixes a crash[1] at boot when amdgpu tries to create
> DEVICE_PRIVATE_MEMORY and is given a physical address by the
> resource management code which is outside the range which can have
> a `struct page`
>
> [...]

Applied to arm64 (for-next/mm), thanks! I dropped the cc: stable, however,
as PHYSMEM_END looks like it only exists in linux-next.

[1/1] arm64: Expose the end of the linear map in PHYSMEM_END
      https://git.kernel.org/arm64/c/eeb8fdfcf090

Cheers,
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 54fb014eba05..0480c61dbb4f 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -110,6 +110,8 @@
 #define PAGE_END		(_PAGE_END(VA_BITS_MIN))
 #endif /* CONFIG_KASAN */
 
+#define PHYSMEM_END	__pa(PAGE_END - 1)
+
 #define MIN_THREAD_SHIFT	(14 + KASAN_THREAD_SHIFT)
 
 /*
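To make the arithmetic behind `PHYSMEM_END = __pa(PAGE_END - 1)` concrete, here is a minimal userspace sketch. It is not kernel code. It assumes the non-KASAN layout where the linear map runs from `-(1 << VA_BITS)` up to `-(1 << (VA_BITS - 1))`, it treats `VA_BITS` and `VA_BITS_MIN` as equal, and it models `__pa()` for linear-map addresses as `(va - PAGE_OFFSET) + PHYS_OFFSET`. The function names and the `phys_offset` parameter are hypothetical stand-ins for the kernel macros; in the kernel the physical offset is determined at boot.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only. Names mirror the arm64 macros but are hypothetical. */

/* Start of the linear map, modeled as -(1 << va_bits). */
uint64_t page_offset(unsigned va_bits)
{
	assert(va_bits >= 2 && va_bits < 64);
	return -(UINT64_C(1) << va_bits);
}

/* End of the linear map (exclusive), modeled as -(1 << (va_bits - 1)). */
uint64_t page_end(unsigned va_bits)
{
	assert(va_bits >= 2 && va_bits < 64);
	return -(UINT64_C(1) << (va_bits - 1));
}

/* Linear-map virtual address to physical address. */
uint64_t lm_to_phys(uint64_t va, unsigned va_bits, uint64_t phys_offset)
{
	return (va - page_offset(va_bits)) + phys_offset;
}

/* Model of PHYSMEM_END: physical address of the last linear-map byte. */
uint64_t physmem_end(unsigned va_bits, uint64_t phys_offset)
{
	return lm_to_phys(page_end(va_bits) - 1, va_bits, phys_offset);
}

/*
 * Returns 1 if [start, start + size) lies entirely inside the modeled
 * linear map, 0 otherwise. This is the kind of bound a resource
 * allocator would need to respect; it is an illustration, not the
 * kernel's actual check.
 */
int range_fits_linear_map(uint64_t start, uint64_t size,
			  unsigned va_bits, uint64_t phys_offset)
{
	uint64_t last;

	if (size == 0 || start < phys_offset)
		return 0;
	last = start + size - 1;
	if (last < start)	/* wrapped around 64 bits */
		return 0;
	return last <= physmem_end(va_bits, phys_offset);
}
```

With 48-bit VAs the modeled linear map spans 2^47 bytes, so with a physical offset of 0 the last mappable physical address is 0x00007fffffffffff. A physical address handed out above that bound has no linear-map address.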
The memory hot-plug and resource management code needs to know the
largest address which can fit in the linear map, so set
PHYSMEM_END for that purpose.

This fixes a crash[1] at boot when amdgpu tries to create
DEVICE_PRIVATE_MEMORY and is given a physical address by the
resource management code which is outside the range which can have
a `struct page`

The Fixes: commit listed below isn't actually broken, but the
reorganization of vmemmap causes the improper DEVICE_PRIVATE_MEMORY address
to go from a warning to a crash.

[1]: Unable to handle kernel paging request at virtual address
     000001ffa6000034
     Mem abort info:
       ESR = 0x0000000096000044
       EC = 0x25: DABT (current EL), IL = 32 bits
       SET = 0, FnV = 0
       EA = 0, S1PTW = 0
       FSC = 0x04: level 0 translation fault
     Data abort info:
       ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
       CM = 0, WnR = 1, TnD = 0, TagAccess = 0
       GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
     user pgtable: 4k pages, 48-bit VAs, pgdp=000008000287c000
     [000001ffa6000034] pgd=0000000000000000, p4d=0000000000000000
     Call trace:
      __init_zone_device_page.constprop.0+0x2c/0xa8
      memmap_init_zone_device+0xf0/0x210
      pagemap_range+0x1e0/0x410
      memremap_pages+0x18c/0x2e0
      devm_memremap_pages+0x30/0x90
      kgd2kfd_init_zone_device+0xf0/0x200 [amdgpu]
      amdgpu_device_ip_init+0x674/0x888 [amdgpu]
      amdgpu_device_init+0x7a4/0xea0 [amdgpu]
      amdgpu_driver_load_kms+0x28/0x1c0 [amdgpu]
      amdgpu_pci_probe+0x1a0/0x560 [amdgpu]
      local_pci_probe+0x48/0xb8
      work_for_cpu_fn+0x24/0x40
      process_one_work+0x170/0x3e0
      worker_thread+0x2ac/0x3e0
      kthread+0xf4/0x108
      ret_from_fork+0x10/0x20

Fixes: 32697ff38287 ("arm64: vmemmap: Avoid base2 order of struct page size to dimension region")
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
Cc: stable@vger.kernel.org
---
Link to v2: https://lore.kernel.org/all/20240709002757.2431399-1-scott@os.amperecomputing.com/
Changes since v2:
 - Change approach again to defining the newly created PHYSMEM_END in
   arch/arm64/include/asm/memory.h

Link to v1: https://lore.kernel.org/all/20240703210707.1986816-1-scott@os.amperecomputing.com/
Changes since v1:
 - Change from fiddling the architecture's MAX_PHYSMEM_BITS to checking
   arch_get_mappable_range().

 arch/arm64/include/asm/memory.h | 2 ++
 1 file changed, 2 insertions(+)