diff mbox series

[1/2] drm/radeon: Don't use WC for VRAM if !RADEON_GEM_GTT_WC

Message ID 1599625320-17637-1-git-send-email-chenhc@lemote.com (mailing list archive)
State Not Applicable
Headers show
Series [1/2] drm/radeon: Don't use WC for VRAM if !RADEON_GEM_GTT_WC | expand

Commit Message

Huacai Chen Sept. 9, 2020, 4:21 a.m. UTC
Though RADEON_GEM_GTT_WC is initially used for GTT, but this flag is
bound to drm_arch_can_wc_memory(), and if arch doesn't support WC, then
VRAM should not use WC.

Signed-off-by: Huacai Chen <chenhc@lemote.com>
---
 drivers/gpu/drm/radeon/radeon_object.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

Comments

Sergei Shtylyov Sept. 9, 2020, 8:05 a.m. UTC | #1
Hello!

On 09.09.2020 7:21, Huacai Chen wrote:

> Though RADEON_GEM_GTT_WC is initially used for GTT, but this flag is
> bound to drm_arch_can_wc_memory(), and if arch doesn't support WC, then
> VRAM should not use WC.
> 
> Signed-off-by: Huacai Chen <chenhc@lemote.com>
> ---
>   drivers/gpu/drm/radeon/radeon_object.c | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
> index f3dee01..07b82d9 100644
> --- a/drivers/gpu/drm/radeon/radeon_object.c
> +++ b/drivers/gpu/drm/radeon/radeon_object.c
> @@ -117,10 +117,16 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
>   						     TTM_PL_FLAG_VRAM;
>   		}
>   
> -		rbo->placements[c].fpfn = 0;
> -		rbo->placements[c++].flags = TTM_PL_FLAG_WC |
> -					     TTM_PL_FLAG_UNCACHED |
> -					     TTM_PL_FLAG_VRAM;
> +		if (rbo->flags & RADEON_GEM_GTT_WC) {
> +			rbo->placements[c].fpfn = 0;

    Shouldn't this statement be placed outside *if* as before?

> +			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
> +						     TTM_PL_FLAG_UNCACHED |
> +						     TTM_PL_FLAG_VRAM;
> +		} else {
> +			rbo->placements[c].fpfn = 0;
> +			rbo->placements[c++].flags = TTM_PL_FLAG_UNCACHED |
> +						     TTM_PL_FLAG_VRAM;
> +		}
>   	}
>   
>   	if (domain & RADEON_GEM_DOMAIN_GTT) {

MBR, Sergei
Tiezhu Yang Sept. 9, 2020, 9:39 a.m. UTC | #2
On 09/09/2020 12:21 PM, Huacai Chen wrote:
> Though RADEON_GEM_GTT_WC is initially used for GTT, but this flag is
> bound to drm_arch_can_wc_memory(), and if arch doesn't support WC, then
> VRAM should not use WC.

+cc RADEON and AMDGPU DRM DRIVERS maintainer
Alex Deucher <alexander.deucher@amd.com>
Christian König <christian.koenig@amd.com>
amd-gfx@lists.freedesktop.org

Hi all,

In the current code, if CONFIG_CPU_LOONGSON64 is set, 
drm_arch_can_wc_memory()
returns false, and then bo->flags clears the flag RADEON_GEM_GTT_WC, so with
this patch, TTM_PL_FLAG_WC of VRAM is removed on the Loongson platform,
the writecombine issue for Loongson64 can be fixed [1].

I find this is done by commit 221004c66a58 ("drm: Loongson-3 doesn't fully
support wc memory"), but I want to know why drm_arch_can_wc_memory() returns
false for Loongson64, is there some historical reasons?

include/drm/drm_cache.h
static inline bool drm_arch_can_wc_memory(void)
{
#if defined(CONFIG_PPC) && !defined(CONFIG_NOT_COHERENT_CACHE)
     return false;
#elif defined(CONFIG_MIPS) && defined(CONFIG_CPU_LOONGSON64)
     return false;
#elif defined(CONFIG_ARM) || defined(CONFIG_ARM64)
...
}

drivers/gpu/drm/radeon/radeon_object.c
int radeon_bo_create()
{
...
     if (!drm_arch_can_wc_memory())
         bo->flags &= ~RADEON_GEM_GTT_WC;
...
}

[1] https://lore.kernel.org/patchwork/patch/1285542/
gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for 
Loongson64
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=221004c66a58

Thanks,
Tiezhu

>
> Signed-off-by: Huacai Chen <chenhc@lemote.com>
> ---
>   drivers/gpu/drm/radeon/radeon_object.c | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
> index f3dee01..07b82d9 100644
> --- a/drivers/gpu/drm/radeon/radeon_object.c
> +++ b/drivers/gpu/drm/radeon/radeon_object.c
> @@ -117,10 +117,16 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
>   						     TTM_PL_FLAG_VRAM;
>   		}
>   
> -		rbo->placements[c].fpfn = 0;
> -		rbo->placements[c++].flags = TTM_PL_FLAG_WC |
> -					     TTM_PL_FLAG_UNCACHED |
> -					     TTM_PL_FLAG_VRAM;
> +		if (rbo->flags & RADEON_GEM_GTT_WC) {
> +			rbo->placements[c].fpfn = 0;
> +			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
> +						     TTM_PL_FLAG_UNCACHED |
> +						     TTM_PL_FLAG_VRAM;
> +		} else {
> +			rbo->placements[c].fpfn = 0;
> +			rbo->placements[c++].flags = TTM_PL_FLAG_UNCACHED |
> +						     TTM_PL_FLAG_VRAM;
> +		}
>   	}
>   
>   	if (domain & RADEON_GEM_DOMAIN_GTT) {
Christian König Sept. 9, 2020, 11 a.m. UTC | #3
Am 09.09.20 um 11:39 schrieb Tiezhu Yang:
> On 09/09/2020 12:21 PM, Huacai Chen wrote:
>> Though RADEON_GEM_GTT_WC is initially used for GTT, but this flag is
>> bound to drm_arch_can_wc_memory(), and if arch doesn't support WC, then
>> VRAM should not use WC.
>
> +cc RADEON and AMDGPU DRM DRIVERS maintainer
> Alex Deucher <alexander.deucher@amd.com>
> Christian König <christian.koenig@amd.com>
> amd-gfx@lists.freedesktop.org
>
> Hi all,
>
> In the current code, if CONFIG_CPU_LOONGSON64 is set, 
> drm_arch_can_wc_memory()
> returns false, and then bo->flags clears the flag RADEON_GEM_GTT_WC, 
> so with
> this patch, TTM_PL_FLAG_WC of VRAM is removed on the Loongson platform,
> the writecombine issue for Loongson64 can be fixed [1].

And broken for mostly all other platforms. The patch is complete nonsense.

See the RADEON_GEM_GTT_WC flag means that that system memory can be 
mapped WC instead of cached, but here it is used to map MEMIO uncached 
instead of WC.

> I find this is done by commit 221004c66a58 ("drm: Loongson-3 doesn't 
> fully
> support wc memory"), but I want to know why drm_arch_can_wc_memory() 
> returns
> false for Loongson64, is there some historical reasons?

It looks like Loongson has a platform bug which prevents the PCIe 
extension which allows unsnooped system memory access to work correctly.

Regards,
Christian.

>
> include/drm/drm_cache.h
> static inline bool drm_arch_can_wc_memory(void)
> {
> #if defined(CONFIG_PPC) && !defined(CONFIG_NOT_COHERENT_CACHE)
>     return false;
> #elif defined(CONFIG_MIPS) && defined(CONFIG_CPU_LOONGSON64)
>     return false;
> #elif defined(CONFIG_ARM) || defined(CONFIG_ARM64)
> ...
> }
>
> drivers/gpu/drm/radeon/radeon_object.c
> int radeon_bo_create()
> {
> ...
>     if (!drm_arch_can_wc_memory())
>         bo->flags &= ~RADEON_GEM_GTT_WC;
> ...
> }
>
> [1] 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fpatchwork%2Fpatch%2F1285542%2F&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ca60b0da3c62448f4b5fe08d854a43420%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637352411519749179&amp;sdata=kiXLj58%2Boma5r5EYRuEsLg%2FwqkXpHtBfxLWVUgC%2B2g0%3D&amp;reserved=0
> gpu/drm: Remove TTM_PL_FLAG_WC of VRAM to fix writecombine issue for 
> Loongson64
> [2] 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3D221004c66a58&amp;data=02%7C01%7Cchristian.koenig%40amd.com%7Ca60b0da3c62448f4b5fe08d854a43420%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637352411519749179&amp;sdata=ekNulVe555%2BVnnsqSx5N0wwH%2BepaBGfzBHQnxddbO5Y%3D&amp;reserved=0
>
> Thanks,
> Tiezhu
>
>>
>> Signed-off-by: Huacai Chen <chenhc@lemote.com>
>> ---
>>   drivers/gpu/drm/radeon/radeon_object.c | 14 ++++++++++----
>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon_object.c 
>> b/drivers/gpu/drm/radeon/radeon_object.c
>> index f3dee01..07b82d9 100644
>> --- a/drivers/gpu/drm/radeon/radeon_object.c
>> +++ b/drivers/gpu/drm/radeon/radeon_object.c
>> @@ -117,10 +117,16 @@ void radeon_ttm_placement_from_domain(struct 
>> radeon_bo *rbo, u32 domain)
>>                                TTM_PL_FLAG_VRAM;
>>           }
>>   -        rbo->placements[c].fpfn = 0;
>> -        rbo->placements[c++].flags = TTM_PL_FLAG_WC |
>> -                         TTM_PL_FLAG_UNCACHED |
>> -                         TTM_PL_FLAG_VRAM;
>> +        if (rbo->flags & RADEON_GEM_GTT_WC) {
>> +            rbo->placements[c].fpfn = 0;
>> +            rbo->placements[c++].flags = TTM_PL_FLAG_WC |
>> +                             TTM_PL_FLAG_UNCACHED |
>> +                             TTM_PL_FLAG_VRAM;
>> +        } else {
>> +            rbo->placements[c].fpfn = 0;
>> +            rbo->placements[c++].flags = TTM_PL_FLAG_UNCACHED |
>> +                             TTM_PL_FLAG_VRAM;
>> +        }
>>       }
>>         if (domain & RADEON_GEM_DOMAIN_GTT) {
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index f3dee01..07b82d9 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -117,10 +117,16 @@  void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 						     TTM_PL_FLAG_VRAM;
 		}
 
-		rbo->placements[c].fpfn = 0;
-		rbo->placements[c++].flags = TTM_PL_FLAG_WC |
-					     TTM_PL_FLAG_UNCACHED |
-					     TTM_PL_FLAG_VRAM;
+		if (rbo->flags & RADEON_GEM_GTT_WC) {
+			rbo->placements[c].fpfn = 0;
+			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
+						     TTM_PL_FLAG_UNCACHED |
+						     TTM_PL_FLAG_VRAM;
+		} else {
+			rbo->placements[c].fpfn = 0;
+			rbo->placements[c++].flags = TTM_PL_FLAG_UNCACHED |
+						     TTM_PL_FLAG_VRAM;
+		}
 	}
 
 	if (domain & RADEON_GEM_DOMAIN_GTT) {