
drm/radeon: Clean up radeon_uvd_force_into_uvd_segment

Message ID 1414488529-3492-1-git-send-email-michel@daenzer.net (mailing list archive)
State New, archived

Commit Message

Michel Dänzer Oct. 28, 2014, 9:28 a.m. UTC
From: Michel Dänzer <michel.daenzer@amd.com>

It was adding a second placement for the second 256MB segment of VRAM,
which is not a good idea for two reasons:

* It fills up the first 256MB segment (which is also typically the CPU
  accessible part) of VRAM first, even for BOs which could go into the
  second 256MB segment. Only once there is no space in the first segment
  does it fall back to the second segment.
* It doesn't work with RADEON_GEM_NO_CPU_ACCESS BOs, which already use
  two VRAM placements.

Change it to instead restrict the range for each VRAM placement. If the
BO can go into the second 256MB segment, set up the range to include
both segments, and set the TTM_PL_FLAG_TOPDOWN flag. That should result
in preferring the second segment for those BOs, falling back to the
first segment.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
---
 drivers/gpu/drm/radeon/radeon_uvd.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)
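
To make the approach described in the commit message easier to follow, here is a
minimal, stand-alone sketch of the clamping logic in plain C. This is not the
kernel code: struct placement, SEGMENT_SIZE and the is_vram/topdown fields are
simplified stand-ins for the radeon/TTM types and flags used by the actual patch
below.

#include <stdio.h>

#define PAGE_SHIFT   12
#define SEGMENT_SIZE (256u * 1024 * 1024)

/* Simplified stand-in for a TTM placement entry (illustrative only). */
struct placement {
	unsigned fpfn;    /* first allowed page frame number */
	unsigned lpfn;    /* last allowed page frame number, 0 = no limit */
	int      is_vram; /* stands in for TTM_PL_FLAG_VRAM */
	int      topdown; /* stands in for TTM_PL_FLAG_TOPDOWN */
};

/* Model of the patched logic: clamp every VRAM placement to the
 * UVD-addressable range and, when the BO is not VRAM-only, allow both
 * 256MB segments while preferring the second one via top-down allocation. */
static void force_into_uvd_segment(struct placement *p, int n, int vram_only)
{
	unsigned lpfn = (vram_only ? SEGMENT_SIZE : 2 * SEGMENT_SIZE) >> PAGE_SHIFT;
	int i;

	for (i = 0; i < n; i++) {
		if (!p[i].is_vram)
			continue;
		if (!vram_only)
			p[i].topdown = 1;
		if (!p[i].lpfn || p[i].lpfn > lpfn)
			p[i].lpfn = lpfn;
	}
}

int main(void)
{
	struct placement pl[2] = {
		{ .fpfn = 0, .lpfn = 0, .is_vram = 1 },
		{ .fpfn = 0, .lpfn = 0, .is_vram = 0 }, /* e.g. a GTT placement */
	};

	force_into_uvd_segment(pl, 2, /*vram_only=*/0);
	printf("VRAM placement: lpfn=%u topdown=%d\n", pl[0].lpfn, pl[0].topdown);
	return 0;
}

With vram_only=0 the VRAM placement ends up with lpfn = 131072 (512MB with 4KB
pages) and the top-down flag set, so allocation is attempted from the top of the
second segment first, falling back towards the first segment.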

Comments

Christian König Oct. 28, 2014, 10:32 a.m. UTC | #1
On 28.10.2014 10:28, Michel Dänzer wrote:
> [...]

I'm not sure if this will work correctly. Please keep in mind that even
if BOs can be in the second segment, they are not allowed to cross
segment borders.

E.g. if you just set lpfn = (2 * 256 * 1024 * 1024) >> PAGE_SHIFT, it
might happen that the first half of a BO lands in the first 256MB
segment and the second half in the second 256MB segment.

Have you considered that as well?

Regards,
Christian.
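
The concern above can be illustrated with a small, hypothetical user-space
check (not kernel code): with lpfn simply raised to 512MB, nothing in the
placement itself stops a BO from straddling the 256MB segment border.

#include <stdbool.h>
#include <stdio.h>

#define PAGE_SHIFT    12
#define SEGMENT_PAGES ((256u * 1024 * 1024) >> PAGE_SHIFT)

/* Returns true if a BO starting at start_pfn and spanning num_pages pages
 * would cross a 256MB segment border. */
static bool crosses_segment(unsigned start_pfn, unsigned num_pages)
{
	unsigned end_pfn = start_pfn + num_pages - 1;

	return (start_pfn / SEGMENT_PAGES) != (end_pfn / SEGMENT_PAGES);
}

int main(void)
{
	/* A 4MB BO placed 2MB below the 256MB boundary straddles both segments. */
	unsigned start = SEGMENT_PAGES - ((2u * 1024 * 1024) >> PAGE_SHIFT);
	unsigned pages = (4u * 1024 * 1024) >> PAGE_SHIFT;

	printf("crosses segment border: %s\n",
	       crosses_segment(start, pages) ? "yes" : "no");
	return 0;
}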

Michel Dänzer Oct. 29, 2014, 8:58 a.m. UTC | #2
On 28.10.2014 19:32, Christian König wrote:
> On 28.10.2014 10:28, Michel Dänzer wrote:
>> [...]
>
> I'm not sure if this will work correctly. Please keep in mind that even
> if BOs can be in the second segment, they are not allowed to cross
> segment borders.
>
> E.g. if you just set lpfn = (2 * 256 * 1024 * 1024) >> PAGE_SHIFT, it
> might happen that the first half of a BO lands in the first 256MB
> segment and the second half in the second 256MB segment.
>
> Have you considered that as well?

No, I wasn't aware of that restriction.

Looking at the current code again, it returns early if (allowed_domains 
== RADEON_GEM_DOMAIN_VRAM || rbo->placement.num_placement > 1). I think 
both of these conditions can only be false if allowed_domains == 
RADEON_GEM_DOMAIN_GTT, so can the second 256MB segment only be used for GTT?
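
For reference, the pre-patch body of radeon_uvd_force_into_uvd_segment that the
early returns above refer to, reconstructed from the removed lines of the diff
below:

void radeon_uvd_force_into_uvd_segment(struct radeon_bo *rbo,
				       uint32_t allowed_domains)
{
	int i;

	for (i = 0; i < rbo->placement.num_placement; ++i) {
		rbo->placements[i].fpfn = 0 >> PAGE_SHIFT;
		rbo->placements[i].lpfn = (256 * 1024 * 1024) >> PAGE_SHIFT;
	}

	/* If it must be in VRAM it must be in the first segment as well */
	if (allowed_domains == RADEON_GEM_DOMAIN_VRAM)
		return;

	/* abort if we already have more than one placement */
	if (rbo->placement.num_placement > 1)
		return;

	/* add another 256MB segment */
	rbo->placements[1] = rbo->placements[0];
	rbo->placements[1].fpfn += (256 * 1024 * 1024) >> PAGE_SHIFT;
	rbo->placements[1].lpfn += (256 * 1024 * 1024) >> PAGE_SHIFT;
	rbo->placement.num_placement++;
	rbo->placement.num_busy_placement++;
}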

Patch

diff --git a/drivers/gpu/drm/radeon/radeon_uvd.c b/drivers/gpu/drm/radeon/radeon_uvd.c
index 11b6624..eca0ea96 100644
--- a/drivers/gpu/drm/radeon/radeon_uvd.c
+++ b/drivers/gpu/drm/radeon/radeon_uvd.c
@@ -259,27 +259,26 @@  int radeon_uvd_resume(struct radeon_device *rdev)
 void radeon_uvd_force_into_uvd_segment(struct radeon_bo *rbo,
 				       uint32_t allowed_domains)
 {
+	unsigned lpfn;
 	int i;
 
-	for (i = 0; i < rbo->placement.num_placement; ++i) {
-		rbo->placements[i].fpfn = 0 >> PAGE_SHIFT;
-		rbo->placements[i].lpfn = (256 * 1024 * 1024) >> PAGE_SHIFT;
-	}
-
-	/* If it must be in VRAM it must be in the first segment as well */
 	if (allowed_domains == RADEON_GEM_DOMAIN_VRAM)
-		return;
+		/* If it must be in VRAM, it must be in the first 256MB segment */
+		lpfn = (256 * 1024 * 1024) >> PAGE_SHIFT;
+	else
+		/* Allow second 256MB segment as well */
+		lpfn = (2 * 256 * 1024 * 1024) >> PAGE_SHIFT;
 
-	/* abort if we already have more than one placement */
-	if (rbo->placement.num_placement > 1)
-		return;
+	for (i = 0; i < rbo->placement.num_placement; ++i) {
+		if (!(rbo->placements[i].flags & TTM_PL_FLAG_VRAM))
+			continue;
 
-	/* add another 256MB segment */
-	rbo->placements[1] = rbo->placements[0];
-	rbo->placements[1].fpfn += (256 * 1024 * 1024) >> PAGE_SHIFT;
-	rbo->placements[1].lpfn += (256 * 1024 * 1024) >> PAGE_SHIFT;
-	rbo->placement.num_placement++;
-	rbo->placement.num_busy_placement++;
+		if (allowed_domains != RADEON_GEM_DOMAIN_VRAM)
+			rbo->placements[i].flags |= TTM_PL_FLAG_TOPDOWN;
+
+		if (!rbo->placements[i].lpfn || rbo->placements[i].lpfn > lpfn)
+			rbo->placements[i].lpfn = lpfn;
+	}
 }
 
 void radeon_uvd_free_handles(struct radeon_device *rdev, struct drm_file *filp)