Message ID | 20240113213347.9562-1-pchelkin@ispras.ru (mailing list archive) |
---|---|
State | New, archived |
Series | drm/ttm: fix ttm pool initialization for no-dma-device drivers |
On 13.01.24 at 22:33, Fedor Pchelkin wrote:
> QXL driver doesn't use any device for DMA mappings or allocations so
> dev_to_node() will panic inside ttm_device_init() on NUMA systems:
>
> general protection fault, probably for non-canonical address 0xdffffc000000007a: 0000 [#1] PREEMPT SMP KASAN NOPTI
> KASAN: null-ptr-deref in range [0x00000000000003d0-0x00000000000003d7]
> CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.7.0+ #9
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
> RIP: 0010:ttm_device_init+0x10e/0x340
> Call Trace:
>  <TASK>
>  qxl_ttm_init+0xaa/0x310
>  qxl_device_init+0x1071/0x2000
>  qxl_pci_probe+0x167/0x3f0
>  local_pci_probe+0xe1/0x1b0
>  pci_device_probe+0x29d/0x790
>  really_probe+0x251/0x910
>  __driver_probe_device+0x1ea/0x390
>  driver_probe_device+0x4e/0x2e0
>  __driver_attach+0x1e3/0x600
>  bus_for_each_dev+0x12d/0x1c0
>  bus_add_driver+0x25a/0x590
>  driver_register+0x15c/0x4b0
>  qxl_pci_driver_init+0x67/0x80
>  do_one_initcall+0xf5/0x5d0
>  kernel_init_freeable+0x637/0xb10
>  kernel_init+0x1c/0x2e0
>  ret_from_fork+0x48/0x80
>  ret_from_fork_asm+0x1b/0x30
>  </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:ttm_device_init+0x10e/0x340
>
> Fall back to NUMA_NO_NODE if there is no device for DMA.
>
> Found by Linux Verification Center (linuxtesting.org).
>
> Fixes: b0a7ce53d494 ("drm/ttm: Schedule delayed_delete worker closer")
> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>

Oh, thanks for that fix.

Reviewed-by: Christian König <christian.koenig@amd.com>

Going to push that into -fixes in a minute.

Regards,
Christian.

> ---
>  drivers/gpu/drm/ttm/ttm_device.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index f5187b384ae9..4130945052ed 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -195,7 +195,7 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
>  		    bool use_dma_alloc, bool use_dma32)
>  {
>  	struct ttm_global *glob = &ttm_glob;
> -	int ret;
> +	int ret, nid;
>
>  	if (WARN_ON(vma_manager == NULL))
>  		return -EINVAL;
> @@ -215,7 +215,12 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
>
>  	ttm_sys_man_init(bdev);
>
> -	ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, use_dma32);
> +	if (dev)
> +		nid = dev_to_node(dev);
> +	else
> +		nid = NUMA_NO_NODE;
> +
> +	ttm_pool_init(&bdev->pool, dev, nid, use_dma_alloc, use_dma32);
>
>  	bdev->vma_manager = vma_manager;
>  	spin_lock_init(&bdev->lru_lock);
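
For readers outside the kernel tree, the guard in the hunk above can be sketched in plain userspace C. This is only an illustration, not the kernel code: `struct mock_device`, `mock_dev_to_node()` and `pool_node_for()` are hypothetical stand-ins invented here, and the only thing taken from the kernel is that `NUMA_NO_NODE` is defined as -1. The mock helper reads a field of `dev` without checking the pointer, which is the kind of access that faults when a driver such as QXL passes no device.

```c
#include <assert.h>
#include <stddef.h>

/* Same value as the kernel's NUMA_NO_NODE. */
#define NUMA_NO_NODE (-1)

/* Hypothetical userspace stand-in for struct device. */
struct mock_device {
	int numa_node;
};

/*
 * Stand-in for a helper that reads the node straight out of the
 * device. It requires dev != NULL; the assert documents that
 * precondition instead of letting the dereference fault.
 */
static int mock_dev_to_node(const struct mock_device *dev)
{
	assert(dev != NULL);
	return dev->numa_node;
}

/*
 * The guard from the patch: only ask the device for its node when
 * there is a device, otherwise fall back to NUMA_NO_NODE.
 */
static int pool_node_for(const struct mock_device *dev)
{
	int nid;

	if (dev)
		nid = mock_dev_to_node(dev);
	else
		nid = NUMA_NO_NODE;

	return nid;
}
```

Calling `pool_node_for()` with a device whose `numa_node` is 1 yields 1, and calling it with `NULL` yields `NUMA_NO_NODE` rather than dereferencing the pointer.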