Message ID | 1493026928-30134-1-git-send-email-jisorce@oblong.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Am 24.04.2017 um 11:42 schrieb Julien Isorce: > But re-add the flag is the bo is moved back to vram. > > This fixes "ring 0/3 stalled" issue which happens when the driver > evicts bo from vram to gtt, at least on TAHITI and CAPVERDE. Interesting find, but NAK on the approach for fixing it. If WC mappings don't work for TAHITI and CAPVERDE we need to figure out why or at least disable them for those hardware generations in general. Disabling WC for BOs swapped out from VRAM won't buy us much if the BO was initially created in GTT anyway. Christian. > > I do not know the exact reason among the following: > - si_copy_dma from vram to gtt is slow if WC > (only for the non-visible part ? specific cases ?) > - Allow snooping (SNOOPED flag from radeon_vm_bo_update). > - WC should not be set at all for bo in the GTT > (same reason why WC is only set for vram domain, > see mesa::r600_init_resource_fields since mesa commit 5b6a0b7 > "gallium/radeon: set GTT WC on tiled textures") > - Bug in WC > - Same reason as why radeon_sa_bo_manager_init is not passing > WC flags if older than CHIP_BONAIRE (see 810b73d1 > drm/radeon: Use write-combined CPU mappings of IBs on >= CIK) > - Same as b738ca5d > Revert "drm/radeon: Use write-combined CPU mappings of ring ..." > - Same as 96ea47c0 > drm/radeon: Disable uncacheable CPU mappings of GTT with RV6xx > see https://bugs.freedesktop.org/show_bug.cgi?id=91268#c2 > > https://bugs.freedesktop.org/show_bug.cgi?id=100712 > --- > drivers/gpu/drm/radeon/radeon.h | 1 + > drivers/gpu/drm/radeon/radeon_object.c | 1 + > drivers/gpu/drm/radeon/radeon_ttm.c | 13 +++++++++++++ > 3 files changed, 15 insertions(+) > > diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h > index 7a39a35..9847f4e 100644 > --- a/drivers/gpu/drm/radeon/radeon.h > +++ b/drivers/gpu/drm/radeon/radeon.h > @@ -518,6 +518,7 @@ struct radeon_bo { > > struct radeon_mn *mn; > struct list_head mn_list; > + u32 vram_flags; > }; > #define gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo, gem_base) > > diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c > index a557869..870f6b0 100644 > --- a/drivers/gpu/drm/radeon/radeon_object.c > +++ b/drivers/gpu/drm/radeon/radeon_object.c > @@ -228,6 +228,7 @@ int radeon_bo_create(struct radeon_device *rdev, > RADEON_GEM_DOMAIN_CPU); > > bo->flags = flags; > + bo->vram_flags = 0; > /* PCI GART is always snooped */ > if (!(rdev->flags & RADEON_IS_PCIE)) > bo->flags &= ~(RADEON_GEM_GTT_WC | RADEON_GEM_GTT_UC); > diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c > index d07ff84..a8743bd 100644 > --- a/drivers/gpu/drm/radeon/radeon_ttm.c > +++ b/drivers/gpu/drm/radeon/radeon_ttm.c > @@ -440,6 +440,19 @@ static int radeon_bo_move(struct ttm_buffer_object *bo, > r = radeon_move_ram_vram(bo, evict, interruptible, > no_wait_gpu, new_mem); > } else { > + /* Clear WC flag when moving bo from vram to gtt. */ > + if (old_mem->mem_type == TTM_PL_VRAM && new_mem->mem_type == TTM_PL_TT) { > + if (rbo->flags & RADEON_GEM_GTT_WC) { > + rbo->vram_flags |= RADEON_GEM_GTT_WC; > + rbo->flags &= ~RADEON_GEM_GTT_WC; > + } > + /* Re-add WC flag when moving back from gtt to vram. */ > + } else if (old_mem->mem_type == TTM_PL_TT && new_mem->mem_type == TTM_PL_VRAM) { > + if (rbo->vram_flags & RADEON_GEM_GTT_WC) { > + rbo->flags |= RADEON_GEM_GTT_WC; > + rbo->vram_flags &= ~RADEON_GEM_GTT_WC; > + } > + } > r = radeon_move_blit(bo, evict, no_wait_gpu, new_mem, old_mem); > } >
On 24 April 2017 at 10:51, Christian König <deathsimple@vodafone.de> wrote: > Am 24.04.2017 um 11:42 schrieb Julien Isorce: > >> But re-add the flag is the bo is moved back to vram. >> >> This fixes "ring 0/3 stalled" issue which happens when the driver >> evicts bo from vram to gtt, at least on TAHITI and CAPVERDE. >> > > Interesting find, but NAK on the approach for fixing it. > Thx for the comments. > > If WC mappings don't work for TAHITI and CAPVERDE we need to figure out > why or at least disable them for those hardware generations in general. > Should I extend https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/radeon/radeon_object.c?h=amd-staging-4.9#n228 to BONAIRE (which will include VERDE and TAHITI) ? (to match https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/radeon/radeon_ib.c?h=amd-staging-4.9#n199 ) > > Disabling WC for BOs swapped out from VRAM won't buy us much if the BO was > initially created in GTT anyway. Initially created in VRAM: https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/radeon/r600_buffer_common.c#n164 > > > Christian. > > > >> I do not know the exact reason among the following: >> - si_copy_dma from vram to gtt is slow if WC >> (only for the non-visible part ? specific cases ?) >> - Allow snooping (SNOOPED flag from radeon_vm_bo_update). >> - WC should not be set at all for bo in the GTT >> (same reason why WC is only set for vram domain, >> see mesa::r600_init_resource_fields since mesa commit 5b6a0b7 >> "gallium/radeon: set GTT WC on tiled textures") >> - Bug in WC >> - Same reason as why radeon_sa_bo_manager_init is not passing >> WC flags if older than CHIP_BONAIRE (see 810b73d1 >> drm/radeon: Use write-combined CPU mappings of IBs on >= CIK) >> - Same as b738ca5d >> Revert "drm/radeon: Use write-combined CPU mappings of ring ..." >> - Same as 96ea47c0 >> drm/radeon: Disable uncacheable CPU mappings of GTT with RV6xx >> see https://bugs.freedesktop.org/show_bug.cgi?id=91268#c2 >> >> https://bugs.freedesktop.org/show_bug.cgi?id=100712 >> --- >> drivers/gpu/drm/radeon/radeon.h | 1 + >> drivers/gpu/drm/radeon/radeon_object.c | 1 + >> drivers/gpu/drm/radeon/radeon_ttm.c | 13 +++++++++++++ >> 3 files changed, 15 insertions(+) >> >> diff --git a/drivers/gpu/drm/radeon/radeon.h >> b/drivers/gpu/drm/radeon/radeon.h >> index 7a39a35..9847f4e 100644 >> --- a/drivers/gpu/drm/radeon/radeon.h >> +++ b/drivers/gpu/drm/radeon/radeon.h >> @@ -518,6 +518,7 @@ struct radeon_bo { >> struct radeon_mn *mn; >> struct list_head mn_list; >> + u32 vram_flags; >> }; >> #define gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo, >> gem_base) >> diff --git a/drivers/gpu/drm/radeon/radeon_object.c >> b/drivers/gpu/drm/radeon/radeon_object.c >> index a557869..870f6b0 100644 >> --- a/drivers/gpu/drm/radeon/radeon_object.c >> +++ b/drivers/gpu/drm/radeon/radeon_object.c >> @@ -228,6 +228,7 @@ int radeon_bo_create(struct radeon_device *rdev, >> RADEON_GEM_DOMAIN_CPU); >> bo->flags = flags; >> + bo->vram_flags = 0; >> /* PCI GART is always snooped */ >> if (!(rdev->flags & RADEON_IS_PCIE)) >> bo->flags &= ~(RADEON_GEM_GTT_WC | RADEON_GEM_GTT_UC); >> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c >> b/drivers/gpu/drm/radeon/radeon_ttm.c >> index d07ff84..a8743bd 100644 >> --- a/drivers/gpu/drm/radeon/radeon_ttm.c >> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c >> @@ -440,6 +440,19 @@ static int radeon_bo_move(struct ttm_buffer_object >> *bo, >> r = radeon_move_ram_vram(bo, evict, interruptible, >> no_wait_gpu, new_mem); >> } else { >> + /* Clear WC flag when moving bo from vram to gtt. */ >> + if (old_mem->mem_type == TTM_PL_VRAM && new_mem->mem_type >> == TTM_PL_TT) { >> + if (rbo->flags & RADEON_GEM_GTT_WC) { >> + rbo->vram_flags |= RADEON_GEM_GTT_WC; >> + rbo->flags &= ~RADEON_GEM_GTT_WC; >> + } >> + /* Re-add WC flag when moving back from gtt to vram. */ >> + } else if (old_mem->mem_type == TTM_PL_TT && >> new_mem->mem_type == TTM_PL_VRAM) { >> + if (rbo->vram_flags & RADEON_GEM_GTT_WC) { >> + rbo->flags |= RADEON_GEM_GTT_WC; >> + rbo->vram_flags &= ~RADEON_GEM_GTT_WC; >> + } >> + } >> r = radeon_move_blit(bo, evict, no_wait_gpu, new_mem, >> old_mem); >> } >> >> > > >
On 24/04/17 06:51 PM, Christian König wrote: > Am 24.04.2017 um 11:42 schrieb Julien Isorce: >> But re-add the flag is the bo is moved back to vram. >> >> This fixes "ring 0/3 stalled" issue which happens when the driver >> evicts bo from vram to gtt, at least on TAHITI and CAPVERDE. > > Interesting find, but NAK on the approach for fixing it. > > If WC mappings don't work for TAHITI and CAPVERDE we need to figure out > why or at least disable them for those hardware generations in general. > > Disabling WC for BOs swapped out from VRAM won't buy us much if the BO > was initially created in GTT anyway. Moreover, RADEON_GEM_GTT_WC shouldn't have any effect at all for a BO which is currently in VRAM, so it's not clear how the patch makes any difference. I suspect it might accidentally cause RADEON_GEM_GTT_WC to be ignored altogether in radeon_vm_bo_update and/or radeon_ttm_placement_from_domain.
On 24/04/17 07:01 PM, Julien Isorce wrote: > > On 24 April 2017 at 10:51, Christian König <deathsimple@vodafone.de > <mailto:deathsimple@vodafone.de>> wrote: > > Am 24.04.2017 um 11:42 schrieb Julien Isorce: > > But re-add the flag is the bo is moved back to vram. > > This fixes "ring 0/3 stalled" issue which happens when the driver > evicts bo from vram to gtt, at least on TAHITI and CAPVERDE. > > > Interesting find, but NAK on the approach for fixing it. > > > Thx for the comments. > > > > If WC mappings don't work for TAHITI and CAPVERDE we need to figure > out why or at least disable them for those hardware generations in > general. > > > Should I extend > https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/radeon/radeon_object.c?h=amd-staging-4.9#n228 > to BONAIRE (which will include VERDE and TAHITI) ? (to match > https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/radeon/radeon_ib.c?h=amd-staging-4.9#n199 ) Not sure the issue is widespread / systemic enough to justify that upstream. You can do whatever you deem appropriate in your project, of course.
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 7a39a35..9847f4e 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -518,6 +518,7 @@ struct radeon_bo { struct radeon_mn *mn; struct list_head mn_list; + u32 vram_flags; }; #define gem_to_radeon_bo(gobj) container_of((gobj), struct radeon_bo, gem_base) diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c index a557869..870f6b0 100644 --- a/drivers/gpu/drm/radeon/radeon_object.c +++ b/drivers/gpu/drm/radeon/radeon_object.c @@ -228,6 +228,7 @@ int radeon_bo_create(struct radeon_device *rdev, RADEON_GEM_DOMAIN_CPU); bo->flags = flags; + bo->vram_flags = 0; /* PCI GART is always snooped */ if (!(rdev->flags & RADEON_IS_PCIE)) bo->flags &= ~(RADEON_GEM_GTT_WC | RADEON_GEM_GTT_UC); diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index d07ff84..a8743bd 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -440,6 +440,19 @@ static int radeon_bo_move(struct ttm_buffer_object *bo, r = radeon_move_ram_vram(bo, evict, interruptible, no_wait_gpu, new_mem); } else { + /* Clear WC flag when moving bo from vram to gtt. */ + if (old_mem->mem_type == TTM_PL_VRAM && new_mem->mem_type == TTM_PL_TT) { + if (rbo->flags & RADEON_GEM_GTT_WC) { + rbo->vram_flags |= RADEON_GEM_GTT_WC; + rbo->flags &= ~RADEON_GEM_GTT_WC; + } + /* Re-add WC flag when moving back from gtt to vram. */ + } else if (old_mem->mem_type == TTM_PL_TT && new_mem->mem_type == TTM_PL_VRAM) { + if (rbo->vram_flags & RADEON_GEM_GTT_WC) { + rbo->flags |= RADEON_GEM_GTT_WC; + rbo->vram_flags &= ~RADEON_GEM_GTT_WC; + } + } r = radeon_move_blit(bo, evict, no_wait_gpu, new_mem, old_mem); }