Message ID | 20240919163713.3126-1-zachwade.k@gmail.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
Series | [v2] platform/x86: ISST: Fix the KASAN report slab-out-of-bounds bug | expand |
On Fri, 2024-09-20 at 00:37 +0800, Zach Wade wrote: > Attaching SST PCI device to VM causes You are not attaching SST PCI device to VM. It seems some hard drives emulates same PCI vendor/device ID. But replacing with topology_logical_package_id() is fine. Let's find out what are those devices. Thanks, Srinivas > "BUG: KASAN: slab-out-of-bounds". > kasan report: > [ 19.411889] > ================================================================== > [ 19.413702] BUG: KASAN: slab-out-of-bounds in > _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] > [ 19.415634] Read of size 8 at addr ffff888829e65200 by task > cpuhp/16/113 > [ 19.417368] > [ 19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G > E 6.9.0 #10 > [ 19.420435] Hardware name: VMware, Inc. VMware20,1/440BX Desktop > Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 > 07/28/2022 > [ 19.422687] Call Trace: > [ 19.424091] <TASK> > [ 19.425448] dump_stack_lvl+0x5d/0x80 > [ 19.426963] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] > [ 19.428694] print_report+0x19d/0x52e > [ 19.430206] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 > [ 19.431837] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] > [ 19.433539] kasan_report+0xf0/0x170 > [ 19.435019] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] > [ 19.436709] _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] > [ 19.438379] ? __pfx_sched_clock_cpu+0x10/0x10 > [ 19.439910] isst_if_cpu_online+0x406/0x58f [isst_if_common] > [ 19.441573] ? __pfx_isst_if_cpu_online+0x10/0x10 [isst_if_common] > [ 19.443263] ? ttwu_queue_wakelist+0x2c1/0x360 > [ 19.444797] cpuhp_invoke_callback+0x221/0xec0 > [ 19.446337] cpuhp_thread_fun+0x21b/0x610 > [ 19.447814] ? __pfx_cpuhp_thread_fun+0x10/0x10 > [ 19.449354] smpboot_thread_fn+0x2e7/0x6e0 > [ 19.450859] ? __pfx_smpboot_thread_fn+0x10/0x10 > [ 19.452405] kthread+0x29c/0x350 > [ 19.453817] ? __pfx_kthread+0x10/0x10 > [ 19.455253] ret_from_fork+0x31/0x70 > [ 19.456685] ? __pfx_kthread+0x10/0x10 > [ 19.458114] ret_from_fork_asm+0x1a/0x30 > [ 19.459573] </TASK> > [ 19.460853] > [ 19.462055] Allocated by task 1198: > [ 19.463410] kasan_save_stack+0x30/0x50 > [ 19.464788] kasan_save_track+0x14/0x30 > [ 19.466139] __kasan_kmalloc+0xaa/0xb0 > [ 19.467465] __kmalloc+0x1cd/0x470 > [ 19.468748] isst_if_cdev_register+0x1da/0x350 [isst_if_common] > [ 19.470233] isst_if_mbox_init+0x108/0xff0 [isst_if_mbox_msr] > [ 19.471670] do_one_initcall+0xa4/0x380 > [ 19.472903] do_init_module+0x238/0x760 > [ 19.474105] load_module+0x5239/0x6f00 > [ 19.475285] init_module_from_file+0xd1/0x130 > [ 19.476506] idempotent_init_module+0x23b/0x650 > [ 19.477725] __x64_sys_finit_module+0xbe/0x130 > [ 19.476506] idempotent_init_module+0x23b/0x650 > [ 19.477725] __x64_sys_finit_module+0xbe/0x130 > [ 19.478920] do_syscall_64+0x82/0x160 > [ 19.480036] entry_SYSCALL_64_after_hwframe+0x76/0x7e > [ 19.481292] > [ 19.482205] The buggy address belongs to the object at > ffff888829e65000 > which belongs to the cache kmalloc-512 of size 512 > [ 19.484818] The buggy address is located 0 bytes to the right of > allocated 512-byte region [ffff888829e65000, ffff888829e65200) > [ 19.487447] > [ 19.488328] The buggy address belongs to the physical page: > [ 19.489569] page: refcount:1 mapcount:0 mapping:0000000000000000 > index:0xffff888829e60c00 pfn:0x829e60 > [ 19.491140] head: order:3 entire_mapcount:0 nr_pages_mapped:0 > pincount:0 > [ 19.492466] anon flags: > 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff) > [ 19.493914] page_type: 0xffffffff() > [ 19.494988] raw: 0057ffffc0000840 ffff88810004cc80 > 0000000000000000 0000000000000001 > [ 19.496451] raw: ffff888829e60c00 0000000080200018 > 00000001ffffffff 0000000000000000 > [ 19.497906] head: 0057ffffc0000840 ffff88810004cc80 > 0000000000000000 0000000000000001 > [ 19.499379] head: ffff888829e60c00 0000000080200018 > 00000001ffffffff 0000000000000000 > [ 19.500844] head: 0057ffffc0000003 ffffea0020a79801 > ffffea0020a79848 00000000ffffffff > [ 19.502316] head: 0000000800000000 0000000000000000 > 00000000ffffffff 0000000000000000 > [ 19.503784] page dumped because: kasan: bad access detected > [ 19.505058] > [ 19.505970] Memory state around the buggy address: > [ 19.507172] ffff888829e65100: 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 00 00 > [ 19.508599] ffff888829e65180: 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 00 00 > [ 19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc fc fc fc > fc fc fc fc > [ 19.510014] ^ > [ 19.510016] ffff888829e65280: fc fc fc fc fc fc fc fc fc fc fc fc > fc fc fc fc > [ 19.510018] ffff888829e65300: fc fc fc fc fc fc fc fc fc fc fc fc > fc fc fc fc > [ 19.515367] > ================================================================== > The reason for this error is physical_package_ids assigned by VMM > have > holes. This will cause value returned by > topology_physical_package_id() > to be more than topology_max_packages(). The allocation uses > topology_max_packages() to allocate memory. topology_max_packages() > returns maximum logical package IDs. Hence use > topology_logical_package_id() instead of > topology_physical_package_id(). > > Fixes: 9a1aac8a96dc ("platform/x86: ISST: PUNIT device mapping with > Sub-NUMA clustering") > Signed-off-by: Zach Wade <zachwade.k@gmail.com> > --- > drivers/platform/x86/intel/speed_select_if/isst_if_common.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git > a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c > b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c > index 10e21563fa46..030c33070b84 100644 > --- a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c > +++ b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c > @@ -316,7 +316,9 @@ static struct pci_dev *_isst_if_get_pci_dev(int > cpu, int bus_no, int dev, int fn > cpu >= nr_cpu_ids || cpu >= num_possible_cpus()) > return NULL; > > - pkg_id = topology_physical_package_id(cpu); > + pkg_id = topology_logical_package_id(cpu); > + if (pkg_id >= topology_max_packages()) > + return NULL; > > bus_number = isst_cpu_info[cpu].bus_info[bus_no]; > if (bus_number < 0)
On 2024/9/20 2:44, srinivas pandruvada wrote: > On Fri, 2024-09-20 at 00:37 +0800, Zach Wade wrote: >> Attaching SST PCI device to VM causes > You are not attaching SST PCI device to VM. It seems some hard drives > emulates same PCI vendor/device ID. > > But replacing with topology_logical_package_id() is fine. > > Let's find out what are those devices. > > Thanks, > Srinivas > So should we delete this description? Do I need to modify the patch again? Thanks, Zach >> "BUG: KASAN: slab-out-of-bounds". >> kasan report: >> [ 19.411889] >> ================================================================== >> [ 19.413702] BUG: KASAN: slab-out-of-bounds in >> _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] >> [ 19.415634] Read of size 8 at addr ffff888829e65200 by task >> cpuhp/16/113 >> [ 19.417368] >> [ 19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G >> E 6.9.0 #10 >> [ 19.420435] Hardware name: VMware, Inc. VMware20,1/440BX Desktop >> Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 >> 07/28/2022 >> [ 19.422687] Call Trace: >> [ 19.424091] <TASK> >> [ 19.425448] dump_stack_lvl+0x5d/0x80 >> [ 19.426963] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] >> [ 19.428694] print_report+0x19d/0x52e >> [ 19.430206] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 >> [ 19.431837] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] >> [ 19.433539] kasan_report+0xf0/0x170 >> [ 19.435019] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] >> [ 19.436709] _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] >> [ 19.438379] ? __pfx_sched_clock_cpu+0x10/0x10 >> [ 19.439910] isst_if_cpu_online+0x406/0x58f [isst_if_common] >> [ 19.441573] ? __pfx_isst_if_cpu_online+0x10/0x10 [isst_if_common] >> [ 19.443263] ? ttwu_queue_wakelist+0x2c1/0x360 >> [ 19.444797] cpuhp_invoke_callback+0x221/0xec0 >> [ 19.446337] cpuhp_thread_fun+0x21b/0x610 >> [ 19.447814] ? __pfx_cpuhp_thread_fun+0x10/0x10 >> [ 19.449354] smpboot_thread_fn+0x2e7/0x6e0 >> [ 19.450859] ? __pfx_smpboot_thread_fn+0x10/0x10 >> [ 19.452405] kthread+0x29c/0x350 >> [ 19.453817] ? __pfx_kthread+0x10/0x10 >> [ 19.455253] ret_from_fork+0x31/0x70 >> [ 19.456685] ? __pfx_kthread+0x10/0x10 >> [ 19.458114] ret_from_fork_asm+0x1a/0x30 >> [ 19.459573] </TASK> >> [ 19.460853] >> [ 19.462055] Allocated by task 1198: >> [ 19.463410] kasan_save_stack+0x30/0x50 >> [ 19.464788] kasan_save_track+0x14/0x30 >> [ 19.466139] __kasan_kmalloc+0xaa/0xb0 >> [ 19.467465] __kmalloc+0x1cd/0x470 >> [ 19.468748] isst_if_cdev_register+0x1da/0x350 [isst_if_common] >> [ 19.470233] isst_if_mbox_init+0x108/0xff0 [isst_if_mbox_msr] >> [ 19.471670] do_one_initcall+0xa4/0x380 >> [ 19.472903] do_init_module+0x238/0x760 >> [ 19.474105] load_module+0x5239/0x6f00 >> [ 19.475285] init_module_from_file+0xd1/0x130 >> [ 19.476506] idempotent_init_module+0x23b/0x650 >> [ 19.477725] __x64_sys_finit_module+0xbe/0x130 >> [ 19.476506] idempotent_init_module+0x23b/0x650 >> [ 19.477725] __x64_sys_finit_module+0xbe/0x130 >> [ 19.478920] do_syscall_64+0x82/0x160 >> [ 19.480036] entry_SYSCALL_64_after_hwframe+0x76/0x7e >> [ 19.481292] >> [ 19.482205] The buggy address belongs to the object at >> ffff888829e65000 >> which belongs to the cache kmalloc-512 of size 512 >> [ 19.484818] The buggy address is located 0 bytes to the right of >> allocated 512-byte region [ffff888829e65000, ffff888829e65200) >> [ 19.487447] >> [ 19.488328] The buggy address belongs to the physical page: >> [ 19.489569] page: refcount:1 mapcount:0 mapping:0000000000000000 >> index:0xffff888829e60c00 pfn:0x829e60 >> [ 19.491140] head: order:3 entire_mapcount:0 nr_pages_mapped:0 >> pincount:0 >> [ 19.492466] anon flags: >> 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff) >> [ 19.493914] page_type: 0xffffffff() >> [ 19.494988] raw: 0057ffffc0000840 ffff88810004cc80 >> 0000000000000000 0000000000000001 >> [ 19.496451] raw: ffff888829e60c00 0000000080200018 >> 00000001ffffffff 0000000000000000 >> [ 19.497906] head: 0057ffffc0000840 ffff88810004cc80 >> 0000000000000000 0000000000000001 >> [ 19.499379] head: ffff888829e60c00 0000000080200018 >> 00000001ffffffff 0000000000000000 >> [ 19.500844] head: 0057ffffc0000003 ffffea0020a79801 >> ffffea0020a79848 00000000ffffffff >> [ 19.502316] head: 0000000800000000 0000000000000000 >> 00000000ffffffff 0000000000000000 >> [ 19.503784] page dumped because: kasan: bad access detected >> [ 19.505058] >> [ 19.505970] Memory state around the buggy address: >> [ 19.507172] ffff888829e65100: 00 00 00 00 00 00 00 00 00 00 00 00 >> 00 00 00 00 >> [ 19.508599] ffff888829e65180: 00 00 00 00 00 00 00 00 00 00 00 00 >> 00 00 00 00 >> [ 19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc fc fc fc >> fc fc fc fc >> [ 19.510014] ^ >> [ 19.510016] ffff888829e65280: fc fc fc fc fc fc fc fc fc fc fc fc >> fc fc fc fc >> [ 19.510018] ffff888829e65300: fc fc fc fc fc fc fc fc fc fc fc fc >> fc fc fc fc >> [ 19.515367] >> ================================================================== >> The reason for this error is physical_package_ids assigned by VMM >> have >> holes. This will cause value returned by >> topology_physical_package_id() >> to be more than topology_max_packages(). The allocation uses >> topology_max_packages() to allocate memory. topology_max_packages() >> returns maximum logical package IDs. Hence use >> topology_logical_package_id() instead of >> topology_physical_package_id(). >> >> Fixes: 9a1aac8a96dc ("platform/x86: ISST: PUNIT device mapping with >> Sub-NUMA clustering") >> Signed-off-by: Zach Wade <zachwade.k@gmail.com> >> --- >> drivers/platform/x86/intel/speed_select_if/isst_if_common.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git >> a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c >> b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c >> index 10e21563fa46..030c33070b84 100644 >> --- a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c >> +++ b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c >> @@ -316,7 +316,9 @@ static struct pci_dev *_isst_if_get_pci_dev(int >> cpu, int bus_no, int dev, int fn >> cpu >= nr_cpu_ids || cpu >= num_possible_cpus()) >> return NULL; >> >> - pkg_id = topology_physical_package_id(cpu); >> + pkg_id = topology_logical_package_id(cpu); >> + if (pkg_id >= topology_max_packages()) >> + return NULL; >> >> bus_number = isst_cpu_info[cpu].bus_info[bus_no]; >> if (bus_number < 0) >
On Sat, 2024-09-21 at 00:19 +0800, Zach Wade wrote: > > > On 2024/9/20 2:44, srinivas pandruvada wrote: > > On Fri, 2024-09-20 at 00:37 +0800, Zach Wade wrote: > > > Attaching SST PCI device to VM causes > > You are not attaching SST PCI device to VM. It seems some hard > > drives > > emulates same PCI vendor/device ID. > > > > But replacing with topology_logical_package_id() is fine. > > > > Let's find out what are those devices. > > > > Thanks, > > Srinivas > > > > So should we delete this description? Do I need to modify the patch > again? No need to remove that line. It doesn't matter how we arrive here. VMM can emulate any PCI device. Some suggestions below. > > Thanks, > Zach > > > > "BUG: KASAN: slab-out-of-bounds". > > > kasan report: > > > [ 19.411889] > > > ================================================================= > > > = > > > [ 19.413702] BUG: KASAN: slab-out-of-bounds in > > > _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] > > > [ 19.415634] Read of size 8 at addr ffff888829e65200 by task > > > cpuhp/16/113 > > > [ 19.417368] > > > [ 19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G > > > E 6.9.0 #10 > > > [ 19.420435] Hardware name: VMware, Inc. VMware20,1/440BX > > > Desktop > > > Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 > > > 07/28/2022 > > > [ 19.422687] Call Trace: > > > [ 19.424091] <TASK> > > > [ 19.425448] dump_stack_lvl+0x5d/0x80 > > > [ 19.426963] ? _isst_if_get_pci_dev+0x3d5/0x400 > > > [isst_if_common] > > > [ 19.428694] print_report+0x19d/0x52e > > > [ 19.430206] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 > > > [ 19.431837] ? _isst_if_get_pci_dev+0x3d5/0x400 > > > [isst_if_common] > > > [ 19.433539] kasan_report+0xf0/0x170 > > > [ 19.435019] ? _isst_if_get_pci_dev+0x3d5/0x400 > > > [isst_if_common] > > > [ 19.436709] _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] > > > [ 19.438379] ? __pfx_sched_clock_cpu+0x10/0x10 > > > [ 19.439910] isst_if_cpu_online+0x406/0x58f [isst_if_common] > > > [ 19.441573] ? __pfx_isst_if_cpu_online+0x10/0x10 > > > [isst_if_common] > > > [ 19.443263] ? ttwu_queue_wakelist+0x2c1/0x360 > > > [ 19.444797] cpuhp_invoke_callback+0x221/0xec0 > > > [ 19.446337] cpuhp_thread_fun+0x21b/0x610 > > > [ 19.447814] ? __pfx_cpuhp_thread_fun+0x10/0x10 > > > [ 19.449354] smpboot_thread_fn+0x2e7/0x6e0 > > > [ 19.450859] ? __pfx_smpboot_thread_fn+0x10/0x10 > > > [ 19.452405] kthread+0x29c/0x350 > > > [ 19.453817] ? __pfx_kthread+0x10/0x10 > > > [ 19.455253] ret_from_fork+0x31/0x70 > > > [ 19.456685] ? __pfx_kthread+0x10/0x10 > > > [ 19.458114] ret_from_fork_asm+0x1a/0x30 > > > [ 19.459573] </TASK> > > > [ 19.460853] > > > [ 19.462055] Allocated by task 1198: > > > [ 19.463410] kasan_save_stack+0x30/0x50 > > > [ 19.464788] kasan_save_track+0x14/0x30 > > > [ 19.466139] __kasan_kmalloc+0xaa/0xb0 > > > [ 19.467465] __kmalloc+0x1cd/0x470 > > > [ 19.468748] isst_if_cdev_register+0x1da/0x350 > > > [isst_if_common] > > > [ 19.470233] isst_if_mbox_init+0x108/0xff0 [isst_if_mbox_msr] > > > [ 19.471670] do_one_initcall+0xa4/0x380 > > > [ 19.472903] do_init_module+0x238/0x760 > > > [ 19.474105] load_module+0x5239/0x6f00 > > > [ 19.475285] init_module_from_file+0xd1/0x130 > > > [ 19.476506] idempotent_init_module+0x23b/0x650 > > > [ 19.477725] __x64_sys_finit_module+0xbe/0x130 > > > [ 19.476506] idempotent_init_module+0x23b/0x650 > > > [ 19.477725] __x64_sys_finit_module+0xbe/0x130 > > > [ 19.478920] do_syscall_64+0x82/0x160 > > > [ 19.480036] entry_SYSCALL_64_after_hwframe+0x76/0x7e > > > [ 19.481292] > > > [ 19.482205] The buggy address belongs to the object at > > > ffff888829e65000 > > > which belongs to the cache kmalloc-512 of size 512 > > > [ 19.484818] The buggy address is located 0 bytes to the right > > > of > > > allocated 512-byte region [ffff888829e65000, ffff888829e65200) > > > [ 19.487447] > > > [ 19.488328] The buggy address belongs to the physical page: > > > [ 19.489569] page: refcount:1 mapcount:0 > > > mapping:0000000000000000 > > > index:0xffff888829e60c00 pfn:0x829e60 > > > [ 19.491140] head: order:3 entire_mapcount:0 nr_pages_mapped:0 > > > pincount:0 > > > [ 19.492466] anon flags: > > > 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff) > > > [ 19.493914] page_type: 0xffffffff() > > > [ 19.494988] raw: 0057ffffc0000840 ffff88810004cc80 > > > 0000000000000000 0000000000000001 > > > [ 19.496451] raw: ffff888829e60c00 0000000080200018 > > > 00000001ffffffff 0000000000000000 > > > [ 19.497906] head: 0057ffffc0000840 ffff88810004cc80 > > > 0000000000000000 0000000000000001 > > > [ 19.499379] head: ffff888829e60c00 0000000080200018 > > > 00000001ffffffff 0000000000000000 > > > [ 19.500844] head: 0057ffffc0000003 ffffea0020a79801 > > > ffffea0020a79848 00000000ffffffff > > > [ 19.502316] head: 0000000800000000 0000000000000000 > > > 00000000ffffffff 0000000000000000 > > > [ 19.503784] page dumped because: kasan: bad access detected > > > [ 19.505058] > > > [ 19.505970] Memory state around the buggy address: > > > [ 19.507172] ffff888829e65100: 00 00 00 00 00 00 00 00 00 00 > > > 00 00 > > > 00 00 00 00 > > > [ 19.508599] ffff888829e65180: 00 00 00 00 00 00 00 00 00 00 > > > 00 00 > > > 00 00 00 00 > > > [ 19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc fc > > > fc fc > > > fc fc fc fc > > > [ 19.510014] ^ > > > [ 19.510016] ffff888829e65280: fc fc fc fc fc fc fc fc fc fc > > > fc fc > > > fc fc fc fc > > > [ 19.510018] ffff888829e65300: fc fc fc fc fc fc fc fc fc fc > > > fc fc > > > fc fc fc fc > > > [ 19.515367] > > > ================================================================= > > > = A new line here " The reason for this error is physical_package_ids assigned by VMware VMM are not continuous and have gaps. This will cause value returned by topology_physical_package_id() to be more than topology_max_packages(). Here the allocation uses topology_max_packages(). The call to topology_max_packages() returns maximum logical package ID not physical ID. Hence use topology_logical_package_id() instead of topology_physical_package_id(). " My copy paste formatting may not be correct to run with ./scripts/checkpatch.pl > > > The reason for this error is physical_package_ids assigned by VMM > > > have > > > holes. This will cause value returned by > > > topology_physical_package_id() > > > to be more than topology_max_packages(). The allocation uses > > > topology_max_packages() to allocate memory. > > > topology_max_packages() > > > returns maximum logical package IDs. Hence use > > > topology_logical_package_id() instead of > > > topology_physical_package_id(). > > > > > > Fixes: 9a1aac8a96dc ("platform/x86: ISST: PUNIT device mapping > > > with > > > Sub-NUMA clustering") > > > Signed-off-by: Zach Wade <zachwade.k@gmail.com> What is the kernel version of your kernel? Cc: <stable@vger.kernel.org> Thanks, Srinivas > > > --- > > > drivers/platform/x86/intel/speed_select_if/isst_if_common.c | 4 > > > +++- > > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > > > diff --git > > > a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c > > > b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c > > > index 10e21563fa46..030c33070b84 100644 > > > --- a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c > > > +++ b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c > > > @@ -316,7 +316,9 @@ static struct pci_dev > > > *_isst_if_get_pci_dev(int > > > cpu, int bus_no, int dev, int fn > > > cpu >= nr_cpu_ids || cpu >= num_possible_cpus()) > > > return NULL; > > > > > > - pkg_id = topology_physical_package_id(cpu); > > > + pkg_id = topology_logical_package_id(cpu); > > > + if (pkg_id >= topology_max_packages()) > > > + return NULL; > > > > > > bus_number = isst_cpu_info[cpu].bus_info[bus_no]; > > > if (bus_number < 0) > > >
On 2024/9/21 3:37, srinivas pandruvada wrote: > On Sat, 2024-09-21 at 00:19 +0800, Zach Wade wrote: >> >> >> On 2024/9/20 2:44, srinivas pandruvada wrote: >>> On Fri, 2024-09-20 at 00:37 +0800, Zach Wade wrote: >>>> Attaching SST PCI device to VM causes >>> You are not attaching SST PCI device to VM. It seems some hard >>> drives >>> emulates same PCI vendor/device ID. >>> >>> But replacing with topology_logical_package_id() is fine. >>> >>> Let's find out what are those devices. >>> >>> Thanks, >>> Srinivas >>> >> >> So should we delete this description? Do I need to modify the patch >> again? > > No need to remove that line. It doesn't matter how we arrive here. VMM > can emulate any PCI device. > OK, I won't change this next time I send it. > Some suggestions below. > >> >> Thanks, >> Zach >> >>>> "BUG: KASAN: slab-out-of-bounds". >>>> kasan report: >>>> [ 19.411889] >>>> ================================================================= >>>> = >>>> [ 19.413702] BUG: KASAN: slab-out-of-bounds in >>>> _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] >>>> [ 19.415634] Read of size 8 at addr ffff888829e65200 by task >>>> cpuhp/16/113 >>>> [ 19.417368] >>>> [ 19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G >>>> E 6.9.0 #10 >>>> [ 19.420435] Hardware name: VMware, Inc. VMware20,1/440BX >>>> Desktop >>>> Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 >>>> 07/28/2022 >>>> [ 19.422687] Call Trace: >>>> [ 19.424091] <TASK> >>>> [ 19.425448] dump_stack_lvl+0x5d/0x80 >>>> [ 19.426963] ? _isst_if_get_pci_dev+0x3d5/0x400 >>>> [isst_if_common] >>>> [ 19.428694] print_report+0x19d/0x52e >>>> [ 19.430206] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 >>>> [ 19.431837] ? _isst_if_get_pci_dev+0x3d5/0x400 >>>> [isst_if_common] >>>> [ 19.433539] kasan_report+0xf0/0x170 >>>> [ 19.435019] ? _isst_if_get_pci_dev+0x3d5/0x400 >>>> [isst_if_common] >>>> [ 19.436709] _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] >>>> [ 19.438379] ? __pfx_sched_clock_cpu+0x10/0x10 >>>> [ 19.439910] isst_if_cpu_online+0x406/0x58f [isst_if_common] >>>> [ 19.441573] ? __pfx_isst_if_cpu_online+0x10/0x10 >>>> [isst_if_common] >>>> [ 19.443263] ? ttwu_queue_wakelist+0x2c1/0x360 >>>> [ 19.444797] cpuhp_invoke_callback+0x221/0xec0 >>>> [ 19.446337] cpuhp_thread_fun+0x21b/0x610 >>>> [ 19.447814] ? __pfx_cpuhp_thread_fun+0x10/0x10 >>>> [ 19.449354] smpboot_thread_fn+0x2e7/0x6e0 >>>> [ 19.450859] ? __pfx_smpboot_thread_fn+0x10/0x10 >>>> [ 19.452405] kthread+0x29c/0x350 >>>> [ 19.453817] ? __pfx_kthread+0x10/0x10 >>>> [ 19.455253] ret_from_fork+0x31/0x70 >>>> [ 19.456685] ? __pfx_kthread+0x10/0x10 >>>> [ 19.458114] ret_from_fork_asm+0x1a/0x30 >>>> [ 19.459573] </TASK> >>>> [ 19.460853] >>>> [ 19.462055] Allocated by task 1198: >>>> [ 19.463410] kasan_save_stack+0x30/0x50 >>>> [ 19.464788] kasan_save_track+0x14/0x30 >>>> [ 19.466139] __kasan_kmalloc+0xaa/0xb0 >>>> [ 19.467465] __kmalloc+0x1cd/0x470 >>>> [ 19.468748] isst_if_cdev_register+0x1da/0x350 >>>> [isst_if_common] >>>> [ 19.470233] isst_if_mbox_init+0x108/0xff0 [isst_if_mbox_msr] >>>> [ 19.471670] do_one_initcall+0xa4/0x380 >>>> [ 19.472903] do_init_module+0x238/0x760 >>>> [ 19.474105] load_module+0x5239/0x6f00 >>>> [ 19.475285] init_module_from_file+0xd1/0x130 >>>> [ 19.476506] idempotent_init_module+0x23b/0x650 >>>> [ 19.477725] __x64_sys_finit_module+0xbe/0x130 >>>> [ 19.476506] idempotent_init_module+0x23b/0x650 >>>> [ 19.477725] __x64_sys_finit_module+0xbe/0x130 >>>> [ 19.478920] do_syscall_64+0x82/0x160 >>>> [ 19.480036] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>>> [ 19.481292] >>>> [ 19.482205] The buggy address belongs to the object at >>>> ffff888829e65000 >>>> which belongs to the cache kmalloc-512 of size 512 >>>> [ 19.484818] The buggy address is located 0 bytes to the right >>>> of >>>> allocated 512-byte region [ffff888829e65000, ffff888829e65200) >>>> [ 19.487447] >>>> [ 19.488328] The buggy address belongs to the physical page: >>>> [ 19.489569] page: refcount:1 mapcount:0 >>>> mapping:0000000000000000 >>>> index:0xffff888829e60c00 pfn:0x829e60 >>>> [ 19.491140] head: order:3 entire_mapcount:0 nr_pages_mapped:0 >>>> pincount:0 >>>> [ 19.492466] anon flags: >>>> 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff) >>>> [ 19.493914] page_type: 0xffffffff() >>>> [ 19.494988] raw: 0057ffffc0000840 ffff88810004cc80 >>>> 0000000000000000 0000000000000001 >>>> [ 19.496451] raw: ffff888829e60c00 0000000080200018 >>>> 00000001ffffffff 0000000000000000 >>>> [ 19.497906] head: 0057ffffc0000840 ffff88810004cc80 >>>> 0000000000000000 0000000000000001 >>>> [ 19.499379] head: ffff888829e60c00 0000000080200018 >>>> 00000001ffffffff 0000000000000000 >>>> [ 19.500844] head: 0057ffffc0000003 ffffea0020a79801 >>>> ffffea0020a79848 00000000ffffffff >>>> [ 19.502316] head: 0000000800000000 0000000000000000 >>>> 00000000ffffffff 0000000000000000 >>>> [ 19.503784] page dumped because: kasan: bad access detected >>>> [ 19.505058] >>>> [ 19.505970] Memory state around the buggy address: >>>> [ 19.507172] ffff888829e65100: 00 00 00 00 00 00 00 00 00 00 >>>> 00 00 >>>> 00 00 00 00 >>>> [ 19.508599] ffff888829e65180: 00 00 00 00 00 00 00 00 00 00 >>>> 00 00 >>>> 00 00 00 00 >>>> [ 19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc fc >>>> fc fc >>>> fc fc fc fc >>>> [ 19.510014] ^ >>>> [ 19.510016] ffff888829e65280: fc fc fc fc fc fc fc fc fc fc >>>> fc fc >>>> fc fc fc fc >>>> [ 19.510018] ffff888829e65300: fc fc fc fc fc fc fc fc fc fc >>>> fc fc >>>> fc fc fc fc >>>> [ 19.515367] >>>> ================================================================= >>>> = > > A new line here I see. > > > " > The reason for this error is physical_package_ids assigned by VMware > VMM > are not continuous and have gaps. This will cause value returned by > topology_physical_package_id() to be more than topology_max_packages(). > > Here the allocation uses topology_max_packages(). The call to > topology_max_packages() returns maximum logical package ID not physical > ID. Hence use topology_logical_package_id() instead of > topology_physical_package_id(). > " Ok, I'll add this description in v3. > > My copy paste formatting may not be correct to run with > ./scripts/checkpatch.pl > >>>> The reason for this error is physical_package_ids assigned by VMM >>>> have >>>> holes. This will cause value returned by >>>> topology_physical_package_id() >>>> to be more than topology_max_packages(). The allocation uses >>>> topology_max_packages() to allocate memory. >>>> topology_max_packages() >>>> returns maximum logical package IDs. Hence use >>>> topology_logical_package_id() instead of >>>> topology_physical_package_id(). >>>> >>>> Fixes: 9a1aac8a96dc ("platform/x86: ISST: PUNIT device mapping >>>> with >>>> Sub-NUMA clustering") >>>> Signed-off-by: Zach Wade <zachwade.k@gmail.com> > > What is the kernel version of your kernel? > Linux kernel master branch 6.9.0. Should I change the patch to a specific development branch? > Cc: <stable@vger.kernel.org> > OK, thanks.Next time I send it I will cc stable@vger.kernel.org Thanks, Zach > > Thanks, > Srinivas > >>>> --- >>>> drivers/platform/x86/intel/speed_select_if/isst_if_common.c | 4 >>>> +++- >>>> 1 file changed, 3 insertions(+), 1 deletion(-) >>>> >>>> diff --git >>>> a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c >>>> b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c >>>> index 10e21563fa46..030c33070b84 100644 >>>> --- a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c >>>> +++ b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c >>>> @@ -316,7 +316,9 @@ static struct pci_dev >>>> *_isst_if_get_pci_dev(int >>>> cpu, int bus_no, int dev, int fn >>>> cpu >= nr_cpu_ids || cpu >= num_possible_cpus()) >>>> return NULL; >>>> >>>> - pkg_id = topology_physical_package_id(cpu); >>>> + pkg_id = topology_logical_package_id(cpu); >>>> + if (pkg_id >= topology_max_packages()) >>>> + return NULL; >>>> >>>> bus_number = isst_cpu_info[cpu].bus_info[bus_no]; >>>> if (bus_number < 0) >>> >> >
diff --git a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c index 10e21563fa46..030c33070b84 100644 --- a/drivers/platform/x86/intel/speed_select_if/isst_if_common.c +++ b/drivers/platform/x86/intel/speed_select_if/isst_if_common.c @@ -316,7 +316,9 @@ static struct pci_dev *_isst_if_get_pci_dev(int cpu, int bus_no, int dev, int fn cpu >= nr_cpu_ids || cpu >= num_possible_cpus()) return NULL; - pkg_id = topology_physical_package_id(cpu); + pkg_id = topology_logical_package_id(cpu); + if (pkg_id >= topology_max_packages()) + return NULL; bus_number = isst_cpu_info[cpu].bus_info[bus_no]; if (bus_number < 0)
Attaching SST PCI device to VM causes "BUG: KASAN: slab-out-of-bounds". kasan report: [ 19.411889] ================================================================== [ 19.413702] BUG: KASAN: slab-out-of-bounds in _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.415634] Read of size 8 at addr ffff888829e65200 by task cpuhp/16/113 [ 19.417368] [ 19.418627] CPU: 16 PID: 113 Comm: cpuhp/16 Tainted: G E 6.9.0 #10 [ 19.420435] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.20192059.B64.2207280713 07/28/2022 [ 19.422687] Call Trace: [ 19.424091] <TASK> [ 19.425448] dump_stack_lvl+0x5d/0x80 [ 19.426963] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.428694] print_report+0x19d/0x52e [ 19.430206] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 19.431837] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.433539] kasan_report+0xf0/0x170 [ 19.435019] ? _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.436709] _isst_if_get_pci_dev+0x3d5/0x400 [isst_if_common] [ 19.438379] ? __pfx_sched_clock_cpu+0x10/0x10 [ 19.439910] isst_if_cpu_online+0x406/0x58f [isst_if_common] [ 19.441573] ? __pfx_isst_if_cpu_online+0x10/0x10 [isst_if_common] [ 19.443263] ? ttwu_queue_wakelist+0x2c1/0x360 [ 19.444797] cpuhp_invoke_callback+0x221/0xec0 [ 19.446337] cpuhp_thread_fun+0x21b/0x610 [ 19.447814] ? __pfx_cpuhp_thread_fun+0x10/0x10 [ 19.449354] smpboot_thread_fn+0x2e7/0x6e0 [ 19.450859] ? __pfx_smpboot_thread_fn+0x10/0x10 [ 19.452405] kthread+0x29c/0x350 [ 19.453817] ? __pfx_kthread+0x10/0x10 [ 19.455253] ret_from_fork+0x31/0x70 [ 19.456685] ? __pfx_kthread+0x10/0x10 [ 19.458114] ret_from_fork_asm+0x1a/0x30 [ 19.459573] </TASK> [ 19.460853] [ 19.462055] Allocated by task 1198: [ 19.463410] kasan_save_stack+0x30/0x50 [ 19.464788] kasan_save_track+0x14/0x30 [ 19.466139] __kasan_kmalloc+0xaa/0xb0 [ 19.467465] __kmalloc+0x1cd/0x470 [ 19.468748] isst_if_cdev_register+0x1da/0x350 [isst_if_common] [ 19.470233] isst_if_mbox_init+0x108/0xff0 [isst_if_mbox_msr] [ 19.471670] do_one_initcall+0xa4/0x380 [ 19.472903] do_init_module+0x238/0x760 [ 19.474105] load_module+0x5239/0x6f00 [ 19.475285] init_module_from_file+0xd1/0x130 [ 19.476506] idempotent_init_module+0x23b/0x650 [ 19.477725] __x64_sys_finit_module+0xbe/0x130 [ 19.476506] idempotent_init_module+0x23b/0x650 [ 19.477725] __x64_sys_finit_module+0xbe/0x130 [ 19.478920] do_syscall_64+0x82/0x160 [ 19.480036] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 19.481292] [ 19.482205] The buggy address belongs to the object at ffff888829e65000 which belongs to the cache kmalloc-512 of size 512 [ 19.484818] The buggy address is located 0 bytes to the right of allocated 512-byte region [ffff888829e65000, ffff888829e65200) [ 19.487447] [ 19.488328] The buggy address belongs to the physical page: [ 19.489569] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888829e60c00 pfn:0x829e60 [ 19.491140] head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 19.492466] anon flags: 0x57ffffc0000840(slab|head|node=1|zone=2|lastcpupid=0x1fffff) [ 19.493914] page_type: 0xffffffff() [ 19.494988] raw: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001 [ 19.496451] raw: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000 [ 19.497906] head: 0057ffffc0000840 ffff88810004cc80 0000000000000000 0000000000000001 [ 19.499379] head: ffff888829e60c00 0000000080200018 00000001ffffffff 0000000000000000 [ 19.500844] head: 0057ffffc0000003 ffffea0020a79801 ffffea0020a79848 00000000ffffffff [ 19.502316] head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000 [ 19.503784] page dumped because: kasan: bad access detected [ 19.505058] [ 19.505970] Memory state around the buggy address: [ 19.507172] ffff888829e65100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 19.508599] ffff888829e65180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 19.510013] >ffff888829e65200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 19.510014] ^ [ 19.510016] ffff888829e65280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 19.510018] ffff888829e65300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 19.515367] ================================================================== The reason for this error is physical_package_ids assigned by VMM have holes. This will cause value returned by topology_physical_package_id() to be more than topology_max_packages(). The allocation uses topology_max_packages() to allocate memory. topology_max_packages() returns maximum logical package IDs. Hence use topology_logical_package_id() instead of topology_physical_package_id(). Fixes: 9a1aac8a96dc ("platform/x86: ISST: PUNIT device mapping with Sub-NUMA clustering") Signed-off-by: Zach Wade <zachwade.k@gmail.com> --- drivers/platform/x86/intel/speed_select_if/isst_if_common.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)