diff mbox series

drm/buddy: fix issue that force_merge cannot free all roots

Message ID 20240808063812.1293955-1-lincao12@amd.com (mailing list archive)
State New, archived
Series drm/buddy: fix issue that force_merge cannot free all roots

Commit Message

Lin.Cao Aug. 8, 2024, 6:38 a.m. UTC
If the buddy manager has more than one root and each root has sub-blocks
that need to be freed, then when drm_buddy_fini() is called, the first
round of force_merge merges and frees all of the sub-blocks of the first
root, whose offset is 0x0 and whose size is the largest (more than half
of the mm size). In subsequent force_merge rounds, if we use 0 as the
start and the remaining mm size as the end, the blocks of the other
roots are skipped in __force_merge(). As a result, the other roots
cannot be freed.

Solution: use each root's offset as the start.

Signed-off-by: Lin.Cao <lincao12@amd.com>
---
 drivers/gpu/drm/drm_buddy.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
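
The skipped-root case described above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: `struct root_range`, `split_roots()`, `window_covers_root()` and `count_covered_roots()` are hypothetical names, and the model only checks whether each root's address range falls inside the window searched in that round, which is a simplification of what __force_merge() actually does with the free lists.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one root block: byte offset and size. */
struct root_range {
	uint64_t offset;
	uint64_t size;
};

/* Integer log2 of a non-zero value (models the kernel's ilog2()). */
static unsigned int ilog2_u64(uint64_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/*
 * Split mm_size into power-of-two roots, largest first, using the same
 * order arithmetic as the loop in drm_buddy_fini(). Returns the number
 * of roots written to roots[].
 */
static int split_roots(uint64_t mm_size, uint64_t chunk_size,
		       struct root_range *roots, int max_roots)
{
	uint64_t offset = 0, size = mm_size;
	int n = 0;

	while (size >= chunk_size && n < max_roots) {
		unsigned int order = ilog2_u64(size) - ilog2_u64(chunk_size);
		uint64_t root_size = chunk_size << order;

		roots[n].offset = offset;
		roots[n].size = root_size;
		offset += root_size;
		size -= root_size;
		n++;
	}
	return n;
}

/* True if the half-open window [start, end) fully contains the root. */
static bool window_covers_root(uint64_t start, uint64_t end,
			       const struct root_range *root)
{
	return root->offset >= start && root->offset + root->size <= end;
}

/*
 * Count the roots that fall inside the window searched in their round.
 * use_root_offset == false models the old call (start fixed at 0);
 * use_root_offset == true models the patched call (start at the root).
 */
static int count_covered_roots(uint64_t mm_size, uint64_t chunk_size,
			       bool use_root_offset)
{
	struct root_range roots[64];
	int n = split_roots(mm_size, chunk_size, roots, 64);
	uint64_t size = mm_size;
	int covered = 0;

	for (int i = 0; i < n; i++) {
		uint64_t start = use_root_offset ? roots[i].offset : 0;

		if (window_covers_root(start, start + size, &roots[i]))
			covered++;
		size -= roots[i].size;	/* remaining mm size for next round */
	}
	return covered;
}
```

For example, with a 4 KiB chunk size and an mm of 12 chunks, the model produces two roots (8 chunks at offset 0, 4 chunks at offset 8 chunks). With the start fixed at 0 the second round searches only the first 4 chunks, so the second root is never covered; with the root's own offset as the start, both roots are covered.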

Comments

Matthew Auld Aug. 9, 2024, 10:09 a.m. UTC | #1
Hi,

On 08/08/2024 07:38, Lin.Cao wrote:
> If the buddy manager has more than one root and each root has sub-blocks
> that need to be freed, then when drm_buddy_fini() is called, the first
> round of force_merge merges and frees all of the sub-blocks of the first
> root, whose offset is 0x0 and whose size is the largest (more than half
> of the mm size). In subsequent force_merge rounds, if we use 0 as the
> start and the remaining mm size as the end, the blocks of the other
> roots are skipped in __force_merge(). As a result, the other roots
> cannot be freed.
> 
> Solution: use each root's offset as the start.
> 
> Signed-off-by: Lin.Cao <lincao12@amd.com>

Nice catch.

> ---
>   drivers/gpu/drm/drm_buddy.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> index 94f8c34fc293..5379687552bc 100644
> --- a/drivers/gpu/drm/drm_buddy.c
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -327,12 +327,14 @@ void drm_buddy_fini(struct drm_buddy *mm)
>   	u64 root_size, size;
>   	unsigned int order;
>   	int i;
> +	u64 start = 0;

Nit: We could maybe declare this alongside root_size and size above, or 
even inside the loop body below? Also no need to initialise it.

>   
>   	size = mm->size;
>   
>   	for (i = 0; i < mm->n_roots; ++i) {
>   		order = ilog2(size) - ilog2(mm->chunk_size);
> -		__force_merge(mm, 0, size, order);
> +		start = drm_buddy_block_offset(mm->roots[i]);
> +		__force_merge(mm, start, start + size, order);
>   
>   		WARN_ON(!drm_buddy_block_is_free(mm->roots[i]));

We do seem to have a testcase for this at the bottom of 
drm_test_buddy_alloc_clear(), so either it is not triggering the 
WARN_ON() here, in which case we should maybe improve the test, or it 
is, but kunit doesn't treat that as a test failure. Maybe we can call 
something like kunit_fail_current_test() here if that WARN_ON() is 
triggered?

For reference our CI is just running all drm selftests with:

/kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/tests/.kunitconfig

>   		drm_block_free(mm, mm->roots[i]);

Patch

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 94f8c34fc293..5379687552bc 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -327,12 +327,14 @@  void drm_buddy_fini(struct drm_buddy *mm)
 	u64 root_size, size;
 	unsigned int order;
 	int i;
+	u64 start = 0;
 
 	size = mm->size;
 
 	for (i = 0; i < mm->n_roots; ++i) {
 		order = ilog2(size) - ilog2(mm->chunk_size);
-		__force_merge(mm, 0, size, order);
+		start = drm_buddy_block_offset(mm->roots[i]);
+		__force_merge(mm, start, start + size, order);
 
 		WARN_ON(!drm_buddy_block_is_free(mm->roots[i]));
 		drm_block_free(mm, mm->roots[i]);