
drm/ttm: fix ttm pool initialization for no-dma-device drivers

Message ID 20240113213347.9562-1-pchelkin@ispras.ru (mailing list archive)
State New, archived

Commit Message

Fedor Pchelkin Jan. 13, 2024, 9:33 p.m. UTC
The QXL driver doesn't use any device for DMA mappings or allocations, so
ttm_device_init() receives a NULL device and dev_to_node() panics on NUMA
systems:

general protection fault, probably for non-canonical address 0xdffffc000000007a: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x00000000000003d0-0x00000000000003d7]
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.7.0+ #9
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
RIP: 0010:ttm_device_init+0x10e/0x340
Call Trace:
 <TASK>
 qxl_ttm_init+0xaa/0x310
 qxl_device_init+0x1071/0x2000
 qxl_pci_probe+0x167/0x3f0
 local_pci_probe+0xe1/0x1b0
 pci_device_probe+0x29d/0x790
 really_probe+0x251/0x910
 __driver_probe_device+0x1ea/0x390
 driver_probe_device+0x4e/0x2e0
 __driver_attach+0x1e3/0x600
 bus_for_each_dev+0x12d/0x1c0
 bus_add_driver+0x25a/0x590
 driver_register+0x15c/0x4b0
 qxl_pci_driver_init+0x67/0x80
 do_one_initcall+0xf5/0x5d0
 kernel_init_freeable+0x637/0xb10
 kernel_init+0x1c/0x2e0
 ret_from_fork+0x48/0x80
 ret_from_fork_asm+0x1b/0x30
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:ttm_device_init+0x10e/0x340

Fall back to NUMA_NO_NODE if there is no device for DMA.

Found by Linux Verification Center (linuxtesting.org).

Fixes: b0a7ce53d494 ("drm/ttm: Schedule delayed_delete worker closer")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
---
 drivers/gpu/drm/ttm/ttm_device.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Christian König Jan. 15, 2024, 10:07 a.m. UTC | #1
On 13.01.24 at 22:33, Fedor Pchelkin wrote:
> QXL driver doesn't use any device for DMA mappings or allocations so
> dev_to_node() will panic inside ttm_device_init() on NUMA systems:
>
> general protection fault, probably for non-canonical address 0xdffffc000000007a: 0000 [#1] PREEMPT SMP KASAN NOPTI
> KASAN: null-ptr-deref in range [0x00000000000003d0-0x00000000000003d7]
> CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.7.0+ #9
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
> RIP: 0010:ttm_device_init+0x10e/0x340
> Call Trace:
>   <TASK>
>   qxl_ttm_init+0xaa/0x310
>   qxl_device_init+0x1071/0x2000
>   qxl_pci_probe+0x167/0x3f0
>   local_pci_probe+0xe1/0x1b0
>   pci_device_probe+0x29d/0x790
>   really_probe+0x251/0x910
>   __driver_probe_device+0x1ea/0x390
>   driver_probe_device+0x4e/0x2e0
>   __driver_attach+0x1e3/0x600
>   bus_for_each_dev+0x12d/0x1c0
>   bus_add_driver+0x25a/0x590
>   driver_register+0x15c/0x4b0
>   qxl_pci_driver_init+0x67/0x80
>   do_one_initcall+0xf5/0x5d0
>   kernel_init_freeable+0x637/0xb10
>   kernel_init+0x1c/0x2e0
>   ret_from_fork+0x48/0x80
>   ret_from_fork_asm+0x1b/0x30
>   </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:ttm_device_init+0x10e/0x340
>
> Fall back to NUMA_NO_NODE if there is no device for DMA.
>
> Found by Linux Verification Center (linuxtesting.org).
>
> Fixes: b0a7ce53d494 ("drm/ttm: Schedule delayed_delete worker closer")
> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>

Oh, thanks for that fix.

Reviewed-by: Christian König <christian.koenig@amd.com>

Going to push that into -fixes in a minute.

Regards,
Christian.

> ---
>   drivers/gpu/drm/ttm/ttm_device.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index f5187b384ae9..4130945052ed 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -195,7 +195,7 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
>   		    bool use_dma_alloc, bool use_dma32)
>   {
>   	struct ttm_global *glob = &ttm_glob;
> -	int ret;
> +	int ret, nid;
>   
>   	if (WARN_ON(vma_manager == NULL))
>   		return -EINVAL;
> @@ -215,7 +215,12 @@ int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
>   
>   	ttm_sys_man_init(bdev);
>   
> -	ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, use_dma32);
> +	if (dev)
> +		nid = dev_to_node(dev);
> +	else
> +		nid = NUMA_NO_NODE;
> +
> +	ttm_pool_init(&bdev->pool, dev, nid, use_dma_alloc, use_dma32);
>   
>   	bdev->vma_manager = vma_manager;
>   	spin_lock_init(&bdev->lru_lock);

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index f5187b384ae9..4130945052ed 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -195,7 +195,7 @@  int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
 		    bool use_dma_alloc, bool use_dma32)
 {
 	struct ttm_global *glob = &ttm_glob;
-	int ret;
+	int ret, nid;
 
 	if (WARN_ON(vma_manager == NULL))
 		return -EINVAL;
@@ -215,7 +215,12 @@  int ttm_device_init(struct ttm_device *bdev, const struct ttm_device_funcs *func
 
 	ttm_sys_man_init(bdev);
 
-	ttm_pool_init(&bdev->pool, dev, dev_to_node(dev), use_dma_alloc, use_dma32);
+	if (dev)
+		nid = dev_to_node(dev);
+	else
+		nid = NUMA_NO_NODE;
+
+	ttm_pool_init(&bdev->pool, dev, nid, use_dma_alloc, use_dma32);
 
 	bdev->vma_manager = vma_manager;
 	spin_lock_init(&bdev->lru_lock);
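
The fallback in the hunk above can be tried outside the kernel with a
minimal userspace sketch. Everything here is a stand-in and an assumption
made for illustration: struct device is reduced to a single field,
dev_to_node() is modelled on its CONFIG_NUMA form (it reads dev->numa_node
without checking the pointer), and pool_nid_for() is a hypothetical helper
that does not exist in TTM. It only mirrors the if/else added by the patch.

```c
#include <stddef.h>

/* Userspace stand-ins for illustration only; not the kernel definitions. */
#define NUMA_NO_NODE (-1)

struct device {
	int numa_node;
};

/*
 * Modelled on dev_to_node() with CONFIG_NUMA enabled: the pointer is
 * dereferenced unconditionally, so passing NULL is invalid.
 */
static int dev_to_node(const struct device *dev)
{
	return dev->numa_node;
}

/*
 * Hypothetical helper with the same logic as the patch: only ask the
 * device for its node when there is a device, otherwise fall back to
 * NUMA_NO_NODE.
 */
static int pool_nid_for(const struct device *dev)
{
	if (dev)
		return dev_to_node(dev);

	return NUMA_NO_NODE;
}
```

With a device whose numa_node is 1, pool_nid_for() returns 1. With a NULL
device, as in the QXL case, it returns NUMA_NO_NODE instead of
dereferencing the pointer.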