
drm/ttm: Don't delete the system manager before the delayed delete

Message ID 20210917175328.694429-1-zackr@vmware.com (mailing list archive)
State New, archived
Series drm/ttm: Don't delete the system manager before the delayed delete

Commit Message

Zack Rusin Sept. 17, 2021, 5:53 p.m. UTC
On some hardware, in particular in virtualized environments, the
system memory can be shared with the "hardware". In those cases,
the BOs allocated through the ttm system manager might be
busy during ttm_bo_put, which results in them being scheduled
for delayed deletion.

The problem is that the ttm system manager is disabled
before the final delayed deletion is run in ttm_device_fini.
This results in crashes during freeing of the BO resources,
because they try to remove themselves from a no longer
existing ttm_resource_manager (e.g. in IGT's core_hotunplug
on vmwgfx).

In general, reloading any driver that could share system memory
resources with the "hardware" could hit this, because nothing
prevents the system memory resources from being scheduled
for delayed deletion (apart from the fact that they are probably
never busy outside of virtualized environments).

Signed-off-by: Zack Rusin <zackr@vmware.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/ttm/ttm_device.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
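
For reference, a condensed sketch of the resulting teardown order in ttm_device_fini() with this patch applied (condensed from the diff at the bottom of this page; unrelated teardown steps are elided as comments):

void ttm_device_fini(struct ttm_device *bdev)
{
	struct ttm_resource_manager *man;

	/* ... remove the device from the global device list, flush pending work ... */

	/* Flush the delayed-delete list while the system manager still exists. */
	if (ttm_bo_delayed_delete(bdev, true))
		pr_debug("Delayed destroy list was clean\n");

	/* Only now disable and detach the system placement manager. */
	man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
	ttm_resource_manager_set_used(man, false);
	ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);

	/* ... drain the LRU lists and release the remaining device resources ... */
}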

Comments

Andrey Grodzovsky Sept. 17, 2021, 6:34 p.m. UTC | #1
On 2021-09-17 1:53 p.m., Zack Rusin wrote:

> On some hardware, in particular in virtualized environments, the
> system memory can be shared with the "hardware". In those cases,
> the BOs allocated through the ttm system manager might be
> busy during ttm_bo_put, which results in them being scheduled
> for delayed deletion.
>
> The problem is that the ttm system manager is disabled
> before the final delayed deletion is run in ttm_device_fini.
> This results in crashes during freeing of the BO resources,
> because they try to remove themselves from a no longer
> existing ttm_resource_manager (e.g. in IGT's core_hotunplug
> on vmwgfx).
>
> In general, reloading any driver that could share system memory
> resources with the "hardware" could hit this, because nothing
> prevents the system memory resources from being scheduled
> for delayed deletion (apart from the fact that they are probably
> never busy outside of virtualized environments).
>
> Signed-off-by: Zack Rusin <zackr@vmware.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Huang Rui <ray.huang@amd.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: dri-devel@lists.freedesktop.org
> ---
>   drivers/gpu/drm/ttm/ttm_device.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index 9eb8f54b66fc..4ef19cafc755 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -225,10 +225,6 @@ void ttm_device_fini(struct ttm_device *bdev)
>   	struct ttm_resource_manager *man;
>   	unsigned i;
>   
> -	man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
> -	ttm_resource_manager_set_used(man, false);
> -	ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
> -
>   	mutex_lock(&ttm_global_mutex);
>   	list_del(&bdev->device_list);
>   	mutex_unlock(&ttm_global_mutex);
> @@ -238,6 +234,10 @@ void ttm_device_fini(struct ttm_device *bdev)
>   	if (ttm_bo_delayed_delete(bdev, true))
>   		pr_debug("Delayed destroy list was clean\n");
>   
> +	man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
> +	ttm_resource_manager_set_used(man, false);
> +	ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
> +


Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>

Andrey


>   	spin_lock(&bdev->lru_lock);
>   	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
>   		if (list_empty(&man->lru[0]))
Christian König Sept. 20, 2021, 6:30 a.m. UTC | #2
On 17.09.21 at 19:53, Zack Rusin wrote:
> On some hardware, in particular in virtualized environments, the
> system memory can be shared with the "hardware". In those cases,
> the BOs allocated through the ttm system manager might be
> busy during ttm_bo_put, which results in them being scheduled
> for delayed deletion.

While the patch itself is probably fine, the reasoning here is a clear NAK.

Buffers in the system domain are not GPU accessible by definition, even 
in a shared environment, and so *must* be idle.

Otherwise you break quite a number of assumptions in the code.

Regards,
Christian.

>
> The problem is that the ttm system manager is disabled
> before the final delayed deletion is run in ttm_device_fini.
> This results in crashes during freeing of the BO resources,
> because they try to remove themselves from a no longer
> existing ttm_resource_manager (e.g. in IGT's core_hotunplug
> on vmwgfx).
>
> In general, reloading any driver that could share system memory
> resources with the "hardware" could hit this, because nothing
> prevents the system memory resources from being scheduled
> for delayed deletion (apart from the fact that they are probably
> never busy outside of virtualized environments).
>
> Signed-off-by: Zack Rusin <zackr@vmware.com>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Huang Rui <ray.huang@amd.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: dri-devel@lists.freedesktop.org
> ---
>   drivers/gpu/drm/ttm/ttm_device.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index 9eb8f54b66fc..4ef19cafc755 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -225,10 +225,6 @@ void ttm_device_fini(struct ttm_device *bdev)
>   	struct ttm_resource_manager *man;
>   	unsigned i;
>   
> -	man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
> -	ttm_resource_manager_set_used(man, false);
> -	ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
> -
>   	mutex_lock(&ttm_global_mutex);
>   	list_del(&bdev->device_list);
>   	mutex_unlock(&ttm_global_mutex);
> @@ -238,6 +234,10 @@ void ttm_device_fini(struct ttm_device *bdev)
>   	if (ttm_bo_delayed_delete(bdev, true))
>   		pr_debug("Delayed destroy list was clean\n");
>   
> +	man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
> +	ttm_resource_manager_set_used(man, false);
> +	ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
> +
>   	spin_lock(&bdev->lru_lock);
>   	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
>   		if (list_empty(&man->lru[0]))
Zack Rusin Sept. 20, 2021, 2:59 p.m. UTC | #3
> On Sep 20, 2021, at 02:30, Christian König <christian.koenig@amd.com> wrote:
> 
> On 17.09.21 at 19:53, Zack Rusin wrote:
>> On some hardware, in particular in virtualized environments, the
>> system memory can be shared with the "hardware". In those cases,
>> the BOs allocated through the ttm system manager might be
>> busy during ttm_bo_put, which results in them being scheduled
>> for delayed deletion.
> 
> While the patch itself is probably fine the reasoning here is a clear NAK.
> 
> Buffers in the system domain are not GPU accessible by definition, even in a shared environment and so *must* be idle.

I’m assuming that means they are not allowed to be ever fenced then, yes?

> Otherwise you break quite a number of assumptions in the code.

Are there more assumptions like that, or do you mean there are more places that depend on the assumption that system domain BOs are always idle? If there are more assumptions like that in TTM, that would be incredibly valuable to know. I haven’t been paying much attention to the kernel code in years, and coming back now to vmwgfx code that is a few years old, it’s almost impossible to tell the difference between “this assumption breaks the driver” and “this driver breaks this assumption”.

z
Zack Rusin Sept. 23, 2021, 1:53 p.m. UTC | #4
On 9/20/21 10:59 AM, Zack Rusin wrote:
>> On Sep 20, 2021, at 02:30, Christian König <christian.koenig@amd.com> wrote:
>>
>> On 17.09.21 at 19:53, Zack Rusin wrote:
>>> On some hardware, in particular in virtualized environments, the
>>> system memory can be shared with the "hardware". In those cases,
>>> the BOs allocated through the ttm system manager might be
>>> busy during ttm_bo_put, which results in them being scheduled
>>> for delayed deletion.
>>
>> While the patch itself is probably fine the reasoning here is a clear NAK.
>>
>> Buffers in the system domain are not GPU accessible by definition, even in a shared environment and so *must* be idle.
> 
> I’m assuming that means they are not allowed to be ever fenced then, yes?

Any thoughts on this? I'd love a confirmation, because it would mean I need to go and rewrite the vmwgfx_mob.c bits where we use TTM_PL_SYSTEM memory (through vmw_bo_create_and_populate) for a page table that is read by the host, and those BOs need to be fenced to prevent destruction of the page tables while the memory they point to is still in use. So if those were never allowed to be fenced in the first place, we probably need to add a new memory type to hold those page tables.

z
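
As a side note on that last point, here is a minimal sketch of what such a new memory type could look like on the placement side, assuming a hypothetical driver-private VMW_PL_SYSTEM slot on top of TTM_PL_PRIV (the name and the slot number are illustrative, not actual vmwgfx code):

#include <drm/ttm/ttm_placement.h>

/* Hypothetical driver-private placement for the MOB page tables. */
#define VMW_PL_SYSTEM	(TTM_PL_PRIV + 2)

static const struct ttm_place vmw_sys_placement_flags = {
	.fpfn = 0,
	.lpfn = 0,
	.mem_type = VMW_PL_SYSTEM,
	.flags = 0
};

/* Placement the page-table BOs would be created with instead of TTM_PL_SYSTEM. */
static struct ttm_placement vmw_sys_placement = {
	.num_placement = 1,
	.placement = &vmw_sys_placement_flags,
	.num_busy_placement = 1,
	.busy_placement = &vmw_sys_placement_flags
};

BOs created with such a placement would no longer live in the TTM system domain, so fencing them would not conflict with the idle assumption discussed above.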
Christian König Sept. 23, 2021, 2:49 p.m. UTC | #5
On 23.09.21 at 15:53, Zack Rusin wrote:
> On 9/20/21 10:59 AM, Zack Rusin wrote:
>>> On Sep 20, 2021, at 02:30, Christian König 
>>> <christian.koenig@amd.com> wrote:
>>>
>>> On 17.09.21 at 19:53, Zack Rusin wrote:
>>>> On some hardware, in particular in virtualized environments, the
>>>> system memory can be shared with the "hardware". In those cases,
>>>> the BOs allocated through the ttm system manager might be
>>>> busy during ttm_bo_put, which results in them being scheduled
>>>> for delayed deletion.
>>>
>>> While the patch itself is probably fine the reasoning here is a 
>>> clear NAK.
>>>
>>> Buffers in the system domain are not GPU accessible by definition, 
>>> even in a shared environment and so *must* be idle.
>>
>> I’m assuming that means they are not allowed to be ever fenced then, 
>> yes?
>
> Any thoughts on this? I'd love a confirmation, because it would mean I 
> need to go and rewrite the vmwgfx_mob.c bits where we use 
> TTM_PL_SYSTEM memory (through vmw_bo_create_and_populate) for a page 
> table that is read by the host, and those BOs need to be fenced to 
> prevent destruction of the page tables while the memory they point to 
> is still in use. So if those were never allowed to be fenced in the 
> first place, we probably need to add a new memory type to hold those 
> page tables.

Yeah, as far as I can see that is pretty much illegal from a design 
point of view.

We could probably change that rule on the TTM side, but I think that 
keeping the design as it is and adding a placement in vmwgfx sounds like 
the cleaner approach.

Christian.

>
> z
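
Building on the hypothetical VMW_PL_SYSTEM placement sketched earlier, a minimal sketch of the driver-private resource manager such a placement would need, roughly following the 5.15-era TTM interfaces (the function names are assumptions, not the actual vmwgfx implementation, and the TTM signatures involved have changed in later kernels):

#include <linux/slab.h>
#include <drm/ttm/ttm_device.h>
#include <drm/ttm/ttm_placement.h>
#include <drm/ttm/ttm_resource.h>

static int vmw_sys_man_alloc(struct ttm_resource_manager *man,
			     struct ttm_buffer_object *bo,
			     const struct ttm_place *place,
			     struct ttm_resource **res)
{
	/* System-like memory needs no range management, just a resource object. */
	*res = kzalloc(sizeof(**res), GFP_KERNEL);
	if (!*res)
		return -ENOMEM;

	ttm_resource_init(bo, place, *res);
	return 0;
}

static void vmw_sys_man_free(struct ttm_resource_manager *man,
			     struct ttm_resource *res)
{
	kfree(res);
}

static const struct ttm_resource_manager_func vmw_sys_manager_func = {
	.alloc = vmw_sys_man_alloc,
	.free = vmw_sys_man_free,
};

int vmw_sys_man_init(struct ttm_device *bdev)
{
	struct ttm_resource_manager *man = kzalloc(sizeof(*man), GFP_KERNEL);

	if (!man)
		return -ENOMEM;

	/* Still backed by regular pages, like TTM_PL_SYSTEM. */
	man->use_tt = true;
	man->func = &vmw_sys_manager_func;

	ttm_resource_manager_init(man, 0);
	ttm_set_driver_manager(bdev, VMW_PL_SYSTEM, man);
	ttm_resource_manager_set_used(man, true);
	return 0;
}

Because such a manager is driver-owned, the driver can tear it down only after it has waited for its own fences, keeping the TTM system domain and its idle assumption out of the picture entirely.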

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
index 9eb8f54b66fc..4ef19cafc755 100644
--- a/drivers/gpu/drm/ttm/ttm_device.c
+++ b/drivers/gpu/drm/ttm/ttm_device.c
@@ -225,10 +225,6 @@  void ttm_device_fini(struct ttm_device *bdev)
 	struct ttm_resource_manager *man;
 	unsigned i;
 
-	man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
-	ttm_resource_manager_set_used(man, false);
-	ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
-
 	mutex_lock(&ttm_global_mutex);
 	list_del(&bdev->device_list);
 	mutex_unlock(&ttm_global_mutex);
@@ -238,6 +234,10 @@  void ttm_device_fini(struct ttm_device *bdev)
 	if (ttm_bo_delayed_delete(bdev, true))
 		pr_debug("Delayed destroy list was clean\n");
 
+	man = ttm_manager_type(bdev, TTM_PL_SYSTEM);
+	ttm_resource_manager_set_used(man, false);
+	ttm_set_driver_manager(bdev, TTM_PL_SYSTEM, NULL);
+
 	spin_lock(&bdev->lru_lock);
 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i)
 		if (list_empty(&man->lru[0]))