Message ID | 20241206161210.163701-1-yazen.ghannam@amd.com (mailing list archive) |
---|---|
Series | AMD NB and SMN rework |
On Fri, Dec 06, 2024 at 04:11:53PM +0000, Yazen Ghannam wrote:
> Hi all,
>
> The theme of this set is decoupling the "AMD node" concept from the
> legacy northbridge support.
>
> Additionally, AMD System Management Network (SMN) access code is
> decoupled and expanded too.
>
> Patches 1-3 begin reducing the scope of AMD_NB.
>
> Patches 4-9 begin moving generic AMD node support out of AMD_NB.
>
> Patches 10-13 move SMN support out of AMD_NB and do some refactoring.
>
> Patch 14 has HSMP reuse SMN functionality.
>
> Patches 15-16 address userspace access to SMN.

So I took the first patch and then booting the first 13 with the intention to
queue them while the remaining three are still being discussed, is causing the
below in my guest.

.config is attached, I've pushed the branch here too, if you wanna test with
it:

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=tip-x86-misc

[ 0.897060] cirrus 0000:00:01.0: [drm] fb0: cirrusdrmfb frame buffer device
[ 0.900310] BUG: kernel NULL pointer dereference, address: 00000000000000c4
[ 0.902551] #PF: supervisor read access in kernel mode
[ 0.904096] #PF: error_code(0x0000) - not-present page
[ 0.904268] PGD 0 P4D 0
[ 0.904268] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 0.904268] CPU: 0 UID: 0 PID: 20 Comm: cpuhp/0 Not tainted 6.13.0-rc1+ #1
[ 0.904268] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2023.11-8 02/21/2024
[ 0.904268] RIP: 0010:pci_read_config_dword+0x9/0x40
[ 0.904268] Code: 00 00 e9 8a f9 57 00 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 <8b> 87 c4 00 00 00 48 89 d1 83 f8 03 74 10 8b 47 38 48 8b 7f 10 89
[ 0.904268] RSP: 0018:ffffc9000012fcd8 EFLAGS: 00010246
[ 0.904268] RAX: 0000000000000000 RBX: ffff88800d296640 RCX: 000000000000003f
[ 0.904268] RDX: ffffc9000012fce4 RSI: 00000000000001c4 RDI: 0000000000000000
[ 0.904268] RBP: ffffc9000012fd60 R08: 0000000000000040 R09: 0000000000000010
[ 0.904268] R10: ffff88800daa1eb0 R11: fffffffffff8dc6f R12: 0000000040000163
[ 0.904268] R13: ffffc9000012fd60 R14: 0000000000000000 R15: ffff88807d62fc90
[ 0.904268] FS: 0000000000000000(0000) GS:ffff88807d600000(0000) knlGS:0000000000000000
[ 0.904268] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.904268] CR2: 00000000000000c4 CR3: 0000000002c1a000 CR4: 00000000003506f0
[ 0.904268] Call Trace:
[ 0.904268] <TASK>
[ 0.904268] ? __die+0x31/0x80
[ 0.904268] ? page_fault_oops+0x15d/0x4f0
[ 0.904268] ? srso_return_thunk+0x5/0x5f
[ 0.904268] ? ttwu_queue_wakelist+0xf7/0x100
[ 0.904268] ? exc_page_fault+0x78/0x150
[ 0.904268] ? asm_exc_page_fault+0x26/0x30
[ 0.904268] ? pci_read_config_dword+0x9/0x40
[ 0.904268] ? srso_return_thunk+0x5/0x5f
[ 0.904268] amd_init_l3_cache.part.0+0x6a/0x110
[ 0.904268] cpuid4_cache_lookup_regs+0xcf/0x2a0
[ 0.904268] populate_cache_leaves+0x6f/0x530
[ 0.904268] ? srso_return_thunk+0x5/0x5f
[ 0.904268] ? dl_server_stop+0x2f/0x40
[ 0.904268] ? srso_return_thunk+0x5/0x5f
[ 0.904268] detect_cache_attributes+0x97/0x330
[ 0.904268] ? __pfx_cacheinfo_cpu_online+0x10/0x10
[ 0.904268] cacheinfo_cpu_online+0x22/0x250
[ 0.904268] ? srso_return_thunk+0x5/0x5f
[ 0.904268] ? __pfx_cacheinfo_cpu_online+0x10/0x10
[ 0.904268] cpuhp_invoke_callback+0x10f/0x480
[ 0.904268] ? try_to_wake_up+0x23b/0x540
[ 0.904268] cpuhp_thread_fun+0xd4/0x160
[ 0.904268] smpboot_thread_fn+0xdd/0x1f0
[ 0.904268] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 0.904268] kthread+0xca/0xf0
[ 0.904268] ? __pfx_kthread+0x10/0x10
[ 0.904268] ret_from_fork+0x50/0x60
[ 0.904268] ? __pfx_kthread+0x10/0x10
[ 0.904268] ret_from_fork_asm+0x1a/0x30
[ 0.904268] </TASK>
[ 0.904268] Modules linked in:
[ 0.904268] CR2: 00000000000000c4
[ 0.904268] ---[ end trace 0000000000000000 ]---
[ 0.904268] RIP: 0010:pci_read_config_dword+0x9/0x40
[ 0.904268] Code: 00 00 e9 8a f9 57 00 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 <8b> 87 c4 00 00 00 48 89 d1 83 f8 03 74 10 8b 47 38 48 8b 7f 10 89
[ 0.988792] RSP: 0018:ffffc9000012fcd8 EFLAGS: 00010246
[ 0.988792] RAX: 0000000000000000 RBX: ffff88800d296640 RCX: 000000000000003f
[ 0.988792] RDX: ffffc9000012fce4 RSI: 00000000000001c4 RDI: 0000000000000000
[ 0.988792] RBP: ffffc9000012fd60 R08: 0000000000000040 R09: 0000000000000010
[ 0.992761] R10: ffff88800daa1eb0 R11: fffffffffff8dc6f R12: 0000000040000163
[ 0.992761] R13: ffffc9000012fd60 R14: 0000000000000000 R15: ffff88807d62fc90
[ 0.992761] FS: 0000000000000000(0000) GS:ffff88807d600000(0000) knlGS:0000000000000000
[ 0.996772] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.996772] CR2: 00000000000000c4 CR3: 0000000002c1a000 CR4: 00000000003506f0
[ 0.996772] note: cpuhp/0[20] exited with irqs disabled
[ 1.680874] tsc: Refined TSC clocksource calibration: 3700.028 MHz
[ 1.683128] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x6aaae08e541, max_idle_ns: 881590514464 ns
[ 1.688137] clocksource: Switched to clocksource tsc

On Fri, Jan 03, 2025 at 10:49:25PM +0100, Borislav Petkov wrote:
> So I took the first patch and then booting the first 13 with the intention to
> queue them while the remaining three are still being discussed, is causing the
> below in my guest.
>
> [...]

Can you please share the guest parameters?

Thanks,
Yazen

On Mon, Jan 06, 2025 at 10:38:45AM -0500, Yazen Ghannam wrote:
> On Fri, Jan 03, 2025 at 10:49:25PM +0100, Borislav Petkov wrote:
> > [...]
>
> Can you please share the guest parameters?

I was able to reproduce it. The patch below seems to fix the issue.

There's a comment in the function that this code is not for virtualized
environments. Also, the "L3 in Northbridge" design doesn't apply to Zen
systems.

I'll keep looking at this to get a better understanding. My first thought
is that this was silently handled before, because the AMD_NB code operated
on PCI IDs. And these wouldn't be exposed to guests, so the northbridge
data structures wouldn't be initialized.

Specifically, I think we now have a non-zero number of northbridges, since
we now use the topology info rather than counting PCI devices.

In any case, I think it's better to have explicit checks.

Thanks,
Yazen

diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 392d09c936d6..93d993a6a1df 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -595,6 +595,12 @@ static void amd_init_l3_cache(struct _cpuid4_info_regs *this_leaf, int index)
 	if (index < 3)
 		return;
 
+	if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR))
+		return;
+
+	if (cpu_feature_enabled(X86_FEATURE_ZEN))
+		return;
+
 	node = topology_amd_node_id(smp_processor_id());
 	this_leaf->nb = node_to_amd_nb(node);
 	if (this_leaf->nb && !this_leaf->nb->l3_cache.indices)
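
For readers following along, here is a rough userspace sketch of why the explicit checks help. It is not kernel code: every `mock_*` name is hypothetical, and the idea that the NULL pointer is the northbridge's PCI device is an assumption drawn from the oops (RDI is 0 and the faulting read is at offset 0xc4 inside pci_read_config_dword()). The two booleans stand in for X86_FEATURE_HYPERVISOR and X86_FEATURE_ZEN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */
struct mock_pci_dev {
	uint32_t l3_reg;		/* models the config register being read */
};

struct mock_nb {
	struct mock_pci_dev *misc;	/* assumed NULL when no NB PCI device is exposed */
	uint32_t l3_indices;
};

struct mock_cpu {
	bool hypervisor;		/* models X86_FEATURE_HYPERVISOR */
	bool zen;			/* models X86_FEATURE_ZEN */
	struct mock_nb *nb;
};

/* Returns true when the L3 register was read, false when setup was skipped. */
bool mock_init_l3_cache(struct mock_cpu *cpu, int index)
{
	if (index < 3)
		return false;

	/* The two explicit checks, mirroring the proposed patch. */
	if (cpu->hypervisor)
		return false;
	if (cpu->zen)
		return false;

	if (!cpu->nb)
		return false;

	/* Without the checks above, the guest case would dereference NULL here. */
	cpu->nb->l3_indices = cpu->nb->misc->l3_reg;
	return true;
}

/* Exercises a guest with an uninitialized device and a legacy bare-metal CPU. */
void mock_demo(void)
{
	struct mock_nb guest_nb = { .misc = NULL, .l3_indices = 0 };
	struct mock_cpu guest = { .hypervisor = true, .zen = true, .nb = &guest_nb };
	struct mock_pci_dev dev = { .l3_reg = 0x3f };
	struct mock_nb legacy_nb = { .misc = &dev, .l3_indices = 0 };
	struct mock_cpu legacy = { .hypervisor = false, .zen = false, .nb = &legacy_nb };

	assert(!mock_init_l3_cache(&guest, 3));
	assert(mock_init_l3_cache(&legacy, 3));
}
```

In this model the guest has a non-NULL northbridge entry but no backing device, which matches the guess above that the northbridge count became non-zero once it was derived from topology info. Returning early on the feature flags keeps the function from ever touching the missing device.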