Message ID: CADnq5_P2F5BnwBiNjXYLWJsXeSw8FWLHOq_MfLpk50TswqMKXQ@mail.gmail.com (mailing list archive)
So what's the difference between WRITE_DATA with PFP vs ME? Would it also be preferable for DMA_DATA and COPY_DATA?

Marek

On Thu, Aug 7, 2014 at 3:59 PM, Alex Deucher <alexdeucher@gmail.com> wrote:
> On Thu, Aug 7, 2014 at 3:46 AM, Michel Dänzer <michel@daenzer.net> wrote:
>> From: Michel Dänzer <michel.daenzer@amd.com>
>>
>> Not doing this causes piglit hangs[0] on my Cape Verde card. No issues on
>> Bonaire and Kaveri though.
>>
>> [0] Same symptoms as those fixed on CIK by 'drm/radeon: set VM base addr
>> using the PFP v2'.
>>
>> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
>
> We should be using PFP as much as possible. Does the attached patch help?
>
> Alex
>
>> ---
>>  drivers/gpu/drm/radeon/radeon_vm.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
>> index ccae4d9..898cbb7 100644
>> --- a/drivers/gpu/drm/radeon/radeon_vm.c
>> +++ b/drivers/gpu/drm/radeon/radeon_vm.c
>> @@ -238,7 +238,9 @@ void radeon_vm_flush(struct radeon_device *rdev,
>>  	uint64_t pd_addr = radeon_bo_gpu_offset(vm->page_directory);
>>
>>  	/* if we can't remember our last VM flush then flush now! */
>> -	if (!vm->last_flush || pd_addr != vm->pd_gpu_addr) {
>> +	/* XXX figure out why we have to flush all the time before CIK */
>> +	if (rdev->family < CHIP_BONAIRE ||
>> +	    !vm->last_flush || pd_addr != vm->pd_gpu_addr) {
>>  		trace_radeon_vm_flush(pd_addr, ring, vm->id);
>>  		vm->pd_gpu_addr = pd_addr;
>>  		radeon_ring_vm_flush(rdev, ring, vm);
>> --
>> 2.0.1
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
The GFX CP is split into two different engines: the PFP and the ME (starting with SI you additionally get the CE as well). The PFP is responsible for reading commands out of memory and forwarding them to the ME (or the CE). Some commands, like simple register writes, can be executed on the PFP as well, but most commands can only run on the ME.

The PFP and the ME are connected through an 8-entry ring buffer (IIRC), so when you do something on the ME that the PFP depends on, you need to block the PFP until the ME finishes its operation.

Whether we should use the PFP or the ME strongly depends on what we want to do, but in most cases (like writing to memory) only the ME can do the operation anyway.

Regards,
Christian.

On 07.08.2014 at 17:38, Marek Olšák wrote:
> So what's difference between WRITE_DATA with PFP vs ME? Would it also
> be preferable for DMA_DATA and COPY_DATA?
>
> Marek
>
> On Thu, Aug 7, 2014 at 3:59 PM, Alex Deucher <alexdeucher@gmail.com> wrote:
>> On Thu, Aug 7, 2014 at 3:46 AM, Michel Dänzer <michel@daenzer.net> wrote:
>>> From: Michel Dänzer <michel.daenzer@amd.com>
>>>
>>> Not doing this causes piglit hangs[0] on my Cape Verde card. No issues on
>>> Bonaire and Kaveri though.
>>>
>>> [0] Same symptoms as those fixed on CIK by 'drm/radeon: set VM base addr
>>> using the PFP v2'.
>>>
>>> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
>> We should be using PFP as much as possible. Does the attached patch help?
>>
>> Alex
>>
>>> [...]
On Thu, Aug 7, 2014 at 11:38 AM, Marek Olšák <maraeo@gmail.com> wrote:
> So what's difference between WRITE_DATA with PFP vs ME? Would it also
> be preferable for DMA_DATA and COPY_DATA?

The PFP comes before the ME in the pipeline. Note that there is no PFP (or CE) on the compute queues, so we can't use PFP (or CE) for compute. According to the internal gfx teams, we should use the PFP whenever possible since the PFP is rarely as busy as the ME. Note also that the engine bit is not always consistent (for some packets 0 = ME, 1 = PFP and for others 1 = ME and 0 = PFP).

Alex

> Marek
>
> On Thu, Aug 7, 2014 at 3:59 PM, Alex Deucher <alexdeucher@gmail.com> wrote:
>> On Thu, Aug 7, 2014 at 3:46 AM, Michel Dänzer <michel@daenzer.net> wrote:
>>> From: Michel Dänzer <michel.daenzer@amd.com>
>>>
>>> Not doing this causes piglit hangs[0] on my Cape Verde card. No issues on
>>> Bonaire and Kaveri though.
>>>
>>> [0] Same symptoms as those fixed on CIK by 'drm/radeon: set VM base addr
>>> using the PFP v2'.
>>>
>>> Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
>>
>> We should be using PFP as much as possible. Does the attached patch help?
>>
>> Alex
>>
>>> [...]
On 08.08.2014 00:55, Alex Deucher wrote:
> Note that there is no PFP (or CE) on the compute queues so we can't
> use PFP (or CE) for compute.

AFAICT cik_hdp_flush_cp_ring_emit() always uses the PFP though.

> Note also that the engine bit is not always consistent (for some packets 0
> = ME, 1 = PFP and for others 1= ME and 0 = PFP).

Ugh. Then we should probably use explicit *_ENGINE_PFP/ME macros instead of *_ENGINE(lucky_number). :)

>> On Thu, Aug 7, 2014 at 3:59 PM, Alex Deucher <alexdeucher@gmail.com> wrote:
>>> We should be using PFP as much as possible. Does the attached patch help?

Unfortunately not.
>>> On Thu, Aug 7, 2014 at 3:59 PM, Alex Deucher <alexdeucher@gmail.com> wrote:
>>>> We should be using PFP as much as possible. Does the attached patch help?
> Unfortunately not.

Maybe add a readback of the VM base addr pointer to make sure that the write has really reached the SRBM?

Otherwise I'm out of ideas as well,
Christian.

On 08.08.2014 at 04:38, Michel Dänzer wrote:
> On 08.08.2014 00:55, Alex Deucher wrote:
>> Note that there is no PFP (or CE) on the compute queues so we can't
>> use PFP (or CE) for compute.
> AFAICT cik_hdp_flush_cp_ring_emit() always uses the PFP though.
>
>> Note also that the engine bit is not always consistent (for some packets 0
>> = ME, 1 = PFP and for others 1= ME and 0 = PFP).
> Ugh. Then we should probably use explicit *_ENGINE_PFP/ME macros instead
> of *_ENGINE(lucky_number). :)
On 08.08.2014 17:44, Christian König wrote:
>>>> On Thu, Aug 7, 2014 at 3:59 PM, Alex Deucher <alexdeucher@gmail.com>
>>>> wrote:
>>>>> We should be using PFP as much as possible. Does the attached
>>>>> patch help?
>> Unfortunately not.
>
> Maybe add a readback of the VM base addr pointer to make sure that the
> write has really reached the SBRM?

I'm not sure what exactly you're thinking of, but I'm happy to test any patches you guys come up with. :)
From e58fc941419a1be461cd202a337a9d7baf11fc36 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Thu, 7 Aug 2014 09:57:21 -0400
Subject: [PATCH] drm/radeon: use pfp for all vm_flush related updates

May fix hangs in some cases.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/radeon/cik.c | 8 ++++----
 drivers/gpu/drm/radeon/si.c  | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index b625646..e7d99e1 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -5958,14 +5958,14 @@ void cik_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm)

 	/* update SH_MEM_* regs */
 	radeon_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
-	radeon_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) |
+	radeon_ring_write(ring, (WRITE_DATA_ENGINE_SEL(usepfp) |
 				 WRITE_DATA_DST_SEL(0)));
 	radeon_ring_write(ring, SRBM_GFX_CNTL >> 2);
 	radeon_ring_write(ring, 0);
 	radeon_ring_write(ring, VMID(vm->id));

 	radeon_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 6));
-	radeon_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) |
+	radeon_ring_write(ring, (WRITE_DATA_ENGINE_SEL(usepfp) |
 				 WRITE_DATA_DST_SEL(0)));
 	radeon_ring_write(ring, SH_MEM_BASES >> 2);
 	radeon_ring_write(ring, 0);
@@ -5976,7 +5976,7 @@ void cik_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm)
 	radeon_ring_write(ring, 0); /* SH_MEM_APE1_LIMIT */

 	radeon_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
-	radeon_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) |
+	radeon_ring_write(ring, (WRITE_DATA_ENGINE_SEL(usepfp) |
 				 WRITE_DATA_DST_SEL(0)));
 	radeon_ring_write(ring, SRBM_GFX_CNTL >> 2);
 	radeon_ring_write(ring, 0);
@@ -5987,7 +5987,7 @@ void cik_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm)

 	/* bits 0-15 are the VM contexts0-15 */
 	radeon_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
-	radeon_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) |
+	radeon_ring_write(ring, (WRITE_DATA_ENGINE_SEL(usepfp) |
 				 WRITE_DATA_DST_SEL(0)));
 	radeon_ring_write(ring, VM_INVALIDATE_REQUEST >> 2);
 	radeon_ring_write(ring, 0);
diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c
index 011779b..dbd9d81 100644
--- a/drivers/gpu/drm/radeon/si.c
+++ b/drivers/gpu/drm/radeon/si.c
@@ -5028,7 +5028,7 @@ void si_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm)

 	/* flush hdp cache */
 	radeon_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
-	radeon_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) |
+	radeon_ring_write(ring, (WRITE_DATA_ENGINE_SEL(1) |
 				 WRITE_DATA_DST_SEL(0)));
 	radeon_ring_write(ring, HDP_MEM_COHERENCY_FLUSH_CNTL >> 2);
 	radeon_ring_write(ring, 0);
@@ -5036,7 +5036,7 @@ void si_vm_flush(struct radeon_device *rdev, int ridx, struct radeon_vm *vm)

 	/* bits 0-15 are the VM contexts0-15 */
 	radeon_ring_write(ring, PACKET3(PACKET3_WRITE_DATA, 3));
-	radeon_ring_write(ring, (WRITE_DATA_ENGINE_SEL(0) |
+	radeon_ring_write(ring, (WRITE_DATA_ENGINE_SEL(1) |
 				 WRITE_DATA_DST_SEL(0)));
 	radeon_ring_write(ring, VM_INVALIDATE_REQUEST >> 2);
 	radeon_ring_write(ring, 0);
--
1.8.3.1