diff mbox

[RFC] drm/radeon: clear WC flag when moving bo from vram to gtt

Message ID 1493026928-30134-1-git-send-email-jisorce@oblong.com (mailing list archive)
State New, archived
Headers show

Commit Message

Julien Isorce April 24, 2017, 9:42 a.m. UTC
But re-add the flag is the bo is moved back to vram.

This fixes "ring 0/3 stalled" issue which happens when the driver
evicts bo from vram to gtt, at least on TAHITI and CAPVERDE.

I do not know the exact reason among the following:
  - si_copy_dma from vram to gtt is slow if WC
	(only for the non-visible part ? specific cases ?)
  - Allow snooping (SNOOPED flag from radeon_vm_bo_update).
  - WC should not be set at all for bo in the GTT
        (same reason why WC is only set for vram domain,
         see mesa::r600_init_resource_fields since mesa commit 5b6a0b7
             "gallium/radeon: set GTT WC on tiled textures")
  - Bug in WC
  - Same reason as why radeon_sa_bo_manager_init is not passing
    WC flags if older than CHIP_BONAIRE (see 810b73d1
        drm/radeon: Use write-combined CPU mappings of IBs on >= CIK)
  - Same as b738ca5d
        Revert "drm/radeon: Use write-combined CPU mappings of ring ..."
  - Same as 96ea47c0
        drm/radeon: Disable uncacheable CPU mappings of GTT with RV6xx
        see https://bugs.freedesktop.org/show_bug.cgi?id=91268#c2

https://bugs.freedesktop.org/show_bug.cgi?id=100712
---
 drivers/gpu/drm/radeon/radeon.h        |  1 +
 drivers/gpu/drm/radeon/radeon_object.c |  1 +
 drivers/gpu/drm/radeon/radeon_ttm.c    | 13 +++++++++++++
 3 files changed, 15 insertions(+)

Comments

Christian König April 24, 2017, 9:51 a.m. UTC | #1
Am 24.04.2017 um 11:42 schrieb Julien Isorce:
> But re-add the flag is the bo is moved back to vram.
>
> This fixes "ring 0/3 stalled" issue which happens when the driver
> evicts bo from vram to gtt, at least on TAHITI and CAPVERDE.

Interesting find, but NAK on the approach for fixing it.

If WC mappings don't work for TAHITI and CAPVERDE we need to figure out 
why or at least disable them for those hardware generations in general.

Disabling WC for BOs swapped out from VRAM won't buy us much if the BO 
was initially created in GTT anyway.

Christian.

>
> I do not know the exact reason among the following:
>    - si_copy_dma from vram to gtt is slow if WC
> 	(only for the non-visible part ? specific cases ?)
>    - Allow snooping (SNOOPED flag from radeon_vm_bo_update).
>    - WC should not be set at all for bo in the GTT
>          (same reason why WC is only set for vram domain,
>           see mesa::r600_init_resource_fields since mesa commit 5b6a0b7
>               "gallium/radeon: set GTT WC on tiled textures")
>    - Bug in WC
>    - Same reason as why radeon_sa_bo_manager_init is not passing
>      WC flags if older than CHIP_BONAIRE (see 810b73d1
>          drm/radeon: Use write-combined CPU mappings of IBs on >= CIK)
>    - Same as b738ca5d
>          Revert "drm/radeon: Use write-combined CPU mappings of ring ..."
>    - Same as 96ea47c0
>          drm/radeon: Disable uncacheable CPU mappings of GTT with RV6xx
>          see https://bugs.freedesktop.org/show_bug.cgi?id=91268#c2
>
> https://bugs.freedesktop.org/show_bug.cgi?id=100712
> ---
>   drivers/gpu/drm/radeon/radeon.h        |  1 +
>   drivers/gpu/drm/radeon/radeon_object.c |  1 +
>   drivers/gpu/drm/radeon/radeon_ttm.c    | 13 +++++++++++++
>   3 files changed, 15 insertions(+)
>
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index 7a39a35..9847f4e 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -518,6 +518,7 @@ struct radeon_bo {
>   
>   	struct radeon_mn		*mn;
>   	struct list_head		mn_list;
> +	u32				vram_flags;
>   };
>   #define gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo, gem_base)
>   
> diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
> index a557869..870f6b0 100644
> --- a/drivers/gpu/drm/radeon/radeon_object.c
> +++ b/drivers/gpu/drm/radeon/radeon_object.c
> @@ -228,6 +228,7 @@ int radeon_bo_create(struct radeon_device *rdev,
>   				       RADEON_GEM_DOMAIN_CPU);
>   
>   	bo->flags = flags;
> +	bo->vram_flags = 0;
>   	/* PCI GART is always snooped */
>   	if (!(rdev->flags & RADEON_IS_PCIE))
>   		bo->flags &= ~(RADEON_GEM_GTT_WC | RADEON_GEM_GTT_UC);
> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
> index d07ff84..a8743bd 100644
> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> @@ -440,6 +440,19 @@ static int radeon_bo_move(struct ttm_buffer_object *bo,
>   		r = radeon_move_ram_vram(bo, evict, interruptible,
>   					    no_wait_gpu, new_mem);
>   	} else {
> +		/* Clear WC flag when moving bo from vram to gtt. */
> +		if (old_mem->mem_type == TTM_PL_VRAM && new_mem->mem_type == TTM_PL_TT) {
> +			if (rbo->flags & RADEON_GEM_GTT_WC) {
> +				rbo->vram_flags |= RADEON_GEM_GTT_WC;
> +				rbo->flags &= ~RADEON_GEM_GTT_WC;
> +			}
> +		/* Re-add WC flag when moving back from gtt to vram. */
> +		} else if (old_mem->mem_type == TTM_PL_TT && new_mem->mem_type == TTM_PL_VRAM) {
> +			if (rbo->vram_flags & RADEON_GEM_GTT_WC) {
> +				rbo->flags |= RADEON_GEM_GTT_WC;
> +				rbo->vram_flags &= ~RADEON_GEM_GTT_WC;
> +			}
> +		}
>   		r = radeon_move_blit(bo, evict, no_wait_gpu, new_mem, old_mem);
>   	}
>
Julien Isorce April 24, 2017, 10:01 a.m. UTC | #2
On 24 April 2017 at 10:51, Christian König <deathsimple@vodafone.de> wrote:

> Am 24.04.2017 um 11:42 schrieb Julien Isorce:
>
>> But re-add the flag is the bo is moved back to vram.
>>
>> This fixes "ring 0/3 stalled" issue which happens when the driver
>> evicts bo from vram to gtt, at least on TAHITI and CAPVERDE.
>>
>
> Interesting find, but NAK on the approach for fixing it.
>

Thx for the comments.


>
> If WC mappings don't work for TAHITI and CAPVERDE we need to figure out
> why or at least disable them for those hardware generations in general.
>

Should I extend
https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/radeon/radeon_object.c?h=amd-staging-4.9#n228
to BONAIRE (which will include VERDE and TAHITI) ? (to match
https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/radeon/radeon_ib.c?h=amd-staging-4.9#n199
 )

>
> Disabling WC for BOs swapped out from VRAM won't buy us much if the BO was
> initially created in GTT anyway.


Initially created in VRAM:
https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/radeon/r600_buffer_common.c#n164


>
>
> Christian.
>
>
>
>> I do not know the exact reason among the following:
>>    - si_copy_dma from vram to gtt is slow if WC
>>         (only for the non-visible part ? specific cases ?)
>>    - Allow snooping (SNOOPED flag from radeon_vm_bo_update).
>>    - WC should not be set at all for bo in the GTT
>>          (same reason why WC is only set for vram domain,
>>           see mesa::r600_init_resource_fields since mesa commit 5b6a0b7
>>               "gallium/radeon: set GTT WC on tiled textures")
>>    - Bug in WC
>>    - Same reason as why radeon_sa_bo_manager_init is not passing
>>      WC flags if older than CHIP_BONAIRE (see 810b73d1
>>          drm/radeon: Use write-combined CPU mappings of IBs on >= CIK)
>>    - Same as b738ca5d
>>          Revert "drm/radeon: Use write-combined CPU mappings of ring ..."
>>    - Same as 96ea47c0
>>          drm/radeon: Disable uncacheable CPU mappings of GTT with RV6xx
>>          see https://bugs.freedesktop.org/show_bug.cgi?id=91268#c2
>>
>> https://bugs.freedesktop.org/show_bug.cgi?id=100712
>> ---
>>   drivers/gpu/drm/radeon/radeon.h        |  1 +
>>   drivers/gpu/drm/radeon/radeon_object.c |  1 +
>>   drivers/gpu/drm/radeon/radeon_ttm.c    | 13 +++++++++++++
>>   3 files changed, 15 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon.h
>> b/drivers/gpu/drm/radeon/radeon.h
>> index 7a39a35..9847f4e 100644
>> --- a/drivers/gpu/drm/radeon/radeon.h
>> +++ b/drivers/gpu/drm/radeon/radeon.h
>> @@ -518,6 +518,7 @@ struct radeon_bo {
>>         struct radeon_mn                *mn;
>>         struct list_head                mn_list;
>> +       u32                             vram_flags;
>>   };
>>   #define gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo,
>> gem_base)
>>   diff --git a/drivers/gpu/drm/radeon/radeon_object.c
>> b/drivers/gpu/drm/radeon/radeon_object.c
>> index a557869..870f6b0 100644
>> --- a/drivers/gpu/drm/radeon/radeon_object.c
>> +++ b/drivers/gpu/drm/radeon/radeon_object.c
>> @@ -228,6 +228,7 @@ int radeon_bo_create(struct radeon_device *rdev,
>>                                        RADEON_GEM_DOMAIN_CPU);
>>         bo->flags = flags;
>> +       bo->vram_flags = 0;
>>         /* PCI GART is always snooped */
>>         if (!(rdev->flags & RADEON_IS_PCIE))
>>                 bo->flags &= ~(RADEON_GEM_GTT_WC | RADEON_GEM_GTT_UC);
>> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c
>> b/drivers/gpu/drm/radeon/radeon_ttm.c
>> index d07ff84..a8743bd 100644
>> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
>> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
>> @@ -440,6 +440,19 @@ static int radeon_bo_move(struct ttm_buffer_object
>> *bo,
>>                 r = radeon_move_ram_vram(bo, evict, interruptible,
>>                                             no_wait_gpu, new_mem);
>>         } else {
>> +               /* Clear WC flag when moving bo from vram to gtt. */
>> +               if (old_mem->mem_type == TTM_PL_VRAM && new_mem->mem_type
>> == TTM_PL_TT) {
>> +                       if (rbo->flags & RADEON_GEM_GTT_WC) {
>> +                               rbo->vram_flags |= RADEON_GEM_GTT_WC;
>> +                               rbo->flags &= ~RADEON_GEM_GTT_WC;
>> +                       }
>> +               /* Re-add WC flag when moving back from gtt to vram. */
>> +               } else if (old_mem->mem_type == TTM_PL_TT &&
>> new_mem->mem_type == TTM_PL_VRAM) {
>> +                       if (rbo->vram_flags & RADEON_GEM_GTT_WC) {
>> +                               rbo->flags |= RADEON_GEM_GTT_WC;
>> +                               rbo->vram_flags &= ~RADEON_GEM_GTT_WC;
>> +                       }
>> +               }
>>                 r = radeon_move_blit(bo, evict, no_wait_gpu, new_mem,
>> old_mem);
>>         }
>>
>>
>
>
>
Michel Dänzer April 24, 2017, 10:06 a.m. UTC | #3
On 24/04/17 06:51 PM, Christian König wrote:
> Am 24.04.2017 um 11:42 schrieb Julien Isorce:
>> But re-add the flag is the bo is moved back to vram.
>>
>> This fixes "ring 0/3 stalled" issue which happens when the driver
>> evicts bo from vram to gtt, at least on TAHITI and CAPVERDE.
> 
> Interesting find, but NAK on the approach for fixing it.
> 
> If WC mappings don't work for TAHITI and CAPVERDE we need to figure out
> why or at least disable them for those hardware generations in general.
> 
> Disabling WC for BOs swapped out from VRAM won't buy us much if the BO
> was initially created in GTT anyway.

Moreover, RADEON_GEM_GTT_WC shouldn't have any effect at all for a BO
which is currently in VRAM, so it's not clear how the patch makes any
difference. I suspect it might accidentally cause RADEON_GEM_GTT_WC to
be ignored altogether in radeon_vm_bo_update and/or
radeon_ttm_placement_from_domain.
Michel Dänzer April 24, 2017, 10:08 a.m. UTC | #4
On 24/04/17 07:01 PM, Julien Isorce wrote:
> 
> On 24 April 2017 at 10:51, Christian König <deathsimple@vodafone.de
> <mailto:deathsimple@vodafone.de>> wrote:
> 
>     Am 24.04.2017 um 11:42 schrieb Julien Isorce:
> 
>         But re-add the flag is the bo is moved back to vram.
> 
>         This fixes "ring 0/3 stalled" issue which happens when the driver
>         evicts bo from vram to gtt, at least on TAHITI and CAPVERDE.
> 
> 
>     Interesting find, but NAK on the approach for fixing it.
> 
> 
> Thx for the comments.
>  
> 
> 
>     If WC mappings don't work for TAHITI and CAPVERDE we need to figure
>     out why or at least disable them for those hardware generations in
>     general.
> 
> 
> Should I extend
> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/radeon/radeon_object.c?h=amd-staging-4.9#n228
> to BONAIRE (which will include VERDE and TAHITI) ? (to match
> https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/radeon/radeon_ib.c?h=amd-staging-4.9#n199 )

Not sure the issue is widespread / systemic enough to justify that
upstream. You can do whatever you deem appropriate in your project, of
course.
diff mbox

Patch

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 7a39a35..9847f4e 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -518,6 +518,7 @@  struct radeon_bo {
 
 	struct radeon_mn		*mn;
 	struct list_head		mn_list;
+	u32				vram_flags;
 };
 #define gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo, gem_base)
 
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index a557869..870f6b0 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -228,6 +228,7 @@  int radeon_bo_create(struct radeon_device *rdev,
 				       RADEON_GEM_DOMAIN_CPU);
 
 	bo->flags = flags;
+	bo->vram_flags = 0;
 	/* PCI GART is always snooped */
 	if (!(rdev->flags & RADEON_IS_PCIE))
 		bo->flags &= ~(RADEON_GEM_GTT_WC | RADEON_GEM_GTT_UC);
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index d07ff84..a8743bd 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -440,6 +440,19 @@  static int radeon_bo_move(struct ttm_buffer_object *bo,
 		r = radeon_move_ram_vram(bo, evict, interruptible,
 					    no_wait_gpu, new_mem);
 	} else {
+		/* Clear WC flag when moving bo from vram to gtt. */
+		if (old_mem->mem_type == TTM_PL_VRAM && new_mem->mem_type == TTM_PL_TT) {
+			if (rbo->flags & RADEON_GEM_GTT_WC) {
+				rbo->vram_flags |= RADEON_GEM_GTT_WC;
+				rbo->flags &= ~RADEON_GEM_GTT_WC;
+			}
+		/* Re-add WC flag when moving back from gtt to vram. */
+		} else if (old_mem->mem_type == TTM_PL_TT && new_mem->mem_type == TTM_PL_VRAM) {
+			if (rbo->vram_flags & RADEON_GEM_GTT_WC) {
+				rbo->flags |= RADEON_GEM_GTT_WC;
+				rbo->vram_flags &= ~RADEON_GEM_GTT_WC;
+			}
+		}
 		r = radeon_move_blit(bo, evict, no_wait_gpu, new_mem, old_mem);
 	}