Message ID | 20241024145432.934086-4-akash.goel@arm.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/panthor: Coherency related fixes |
On 10/24/24 16:39, Boris Brezillon wrote:
> On Thu, 24 Oct 2024 15:54:32 +0100
> Akash Goel <akash.goel@arm.com> wrote:
>
>> All CPU mappings are forced as uncached for Panthor buffer objects when
>> system(IO) coherency is disabled. Physical backing for Panthor BOs is
>> allocated by shmem, which clears the pages also after allocation. But
>> there is no explicit cache flush done after the clearing of pages.
>> So it could happen that there are dirty cachelines in the CPU cache
>> for the BOs, when they are accessed from the CPU side through uncached
>> CPU mapping, and the eviction of cachelines overwrites the data of BOs.
>
> Hm, this looks like something that should be handled at the
> drm_gem_shmem level when drm_gem_shmem_object::map_wc=true, as I
> suspect other drivers can hit the same issue (I'm thinking of panfrost
> and lima, but there might be others).
>

I am sorry for the late reply. Many thanks for the quick feedback.

I assume you also reckon that there is a potential problem here for arm64.

Fully agree with your suggestion that the handling needs to be at the
drm_gem_shmem level.

I was not sure if we really needed to do anything, as I didn't observe any
overwrite issue during testing. So I thought it better to limit the change
to Panthor and get some feedback.

shmem calls 'flush_dcache_folio()' after clearing the pages, but that just
clears the 'PG_dcache_clean' bit and the CPU cache is not cleaned
immediately.

I realize that this patch is not foolproof, as Userspace can try to
populate the BO from the CPU side before mapping it on the GPU side.

Not sure if we also need to consider the case when shmem pages are swapped
out. Don't know if there could be a similar situation of dirty cachelines
after the swap in.

Also not sure if dma_sync_sgtable_for_device() can be called from
drm_gem_shmem_get_pages(), as the sg_table won't be available at that
point.

Please let me know your thoughts.

Best regards
Akash

>>
>> This commit tries to avoid the potential overwrite scenario.
>>
>> Signed-off-by: Akash Goel <akash.goel@arm.com>
>> ---
>>  drivers/gpu/drm/panthor/panthor_gem.h | 10 ++++++++++
>>  drivers/gpu/drm/panthor/panthor_mmu.c |  5 +++++
>>  2 files changed, 15 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/panthor/panthor_gem.h b/drivers/gpu/drm/panthor/panthor_gem.h
>> index e43021cf6d45..4b0f43f1edf1 100644
>> --- a/drivers/gpu/drm/panthor/panthor_gem.h
>> +++ b/drivers/gpu/drm/panthor/panthor_gem.h
>> @@ -46,6 +46,16 @@ struct panthor_gem_object {
>>
>>  	/** @flags: Combination of drm_panthor_bo_flags flags. */
>>  	u32 flags;
>> +
>> +	/**
>> +	 * @cleaned: The buffer object pages have been cleaned.
>> +	 *
>> +	 * There could be dirty CPU cachelines for the pages of buffer object
>> +	 * after allocation, as shmem will zero out the pages. The cachelines
>> +	 * need to be cleaned if the pages are going to be accessed with an
>> +	 * uncached CPU mapping.
>> +	 */
>> +	bool cleaned;
>>  };
>>
>>  /**
>> diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
>> index f522a116c1b1..d8cc9e7d064e 100644
>> --- a/drivers/gpu/drm/panthor/panthor_mmu.c
>> +++ b/drivers/gpu/drm/panthor/panthor_mmu.c
>> @@ -1249,6 +1249,11 @@ static int panthor_vm_prepare_map_op_ctx(struct panthor_vm_op_ctx *op_ctx,
>>
>>  	op_ctx->map.sgt = sgt;
>>
>> +	if (bo->base.map_wc && !bo->cleaned) {
>> +		dma_sync_sgtable_for_device(vm->ptdev->base.dev, sgt, DMA_TO_DEVICE);
>> +		bo->cleaned = true;
>> +	}
>> +
>>  	preallocated_vm_bo = drm_gpuvm_bo_create(&vm->base, &bo->base.base);
>>  	if (!preallocated_vm_bo) {
>>  		if (!bo->base.base.import_attach)
>
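As a reference point for the discussion above, below is a minimal sketch of what the suggested generic handling at the drm_gem_shmem level could look like, keyed off drm_gem_shmem_object::map_wc. The helper name and the choice of call site (somewhere the sg_table already exists, rather than drm_gem_shmem_get_pages()) are assumptions made purely for illustration; nothing like this was posted in the thread.

```c
#include <linux/dma-mapping.h>
#include <drm/drm_gem_shmem_helper.h>

/*
 * Hypothetical helper, for illustration only: clean the CPU cachelines
 * covering a shmem-backed GEM object's pages when the object will be
 * mapped write-combined (uncached) on the CPU side. Assumes it is called
 * from a point where the object's sg_table has already been created and
 * mapped for the device.
 */
static void drm_gem_shmem_clean_pages_for_wc(struct drm_gem_shmem_object *shmem,
					     struct sg_table *sgt)
{
	struct drm_gem_object *obj = &shmem->base;

	/* Cacheable CPU mappings do not have this aliasing problem. */
	if (!shmem->map_wc)
		return;

	/*
	 * Write back the cachelines dirtied by shmem's zeroing of the
	 * pages, so a later eviction cannot overwrite data written through
	 * the uncached CPU mapping.
	 */
	dma_sync_sgtable_for_device(obj->dev->dev, sgt, DMA_TO_DEVICE);
}
```

The posted patch performs the equivalent clean lazily, at first GPU map time in panthor_vm_prepare_map_op_ctx(), where the sg_table for the BO is available.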
All CPU mappings are forced as uncached for Panthor buffer objects when
system(IO) coherency is disabled. Physical backing for Panthor BOs is
allocated by shmem, which clears the pages also after allocation. But
there is no explicit cache flush done after the clearing of pages.
So it could happen that there are dirty cachelines in the CPU cache
for the BOs, when they are accessed from the CPU side through uncached
CPU mapping, and the eviction of cachelines overwrites the data of BOs.

This commit tries to avoid the potential overwrite scenario.

Signed-off-by: Akash Goel <akash.goel@arm.com>
---
 drivers/gpu/drm/panthor/panthor_gem.h | 10 ++++++++++
 drivers/gpu/drm/panthor/panthor_mmu.c |  5 +++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_gem.h b/drivers/gpu/drm/panthor/panthor_gem.h
index e43021cf6d45..4b0f43f1edf1 100644
--- a/drivers/gpu/drm/panthor/panthor_gem.h
+++ b/drivers/gpu/drm/panthor/panthor_gem.h
@@ -46,6 +46,16 @@ struct panthor_gem_object {

 	/** @flags: Combination of drm_panthor_bo_flags flags. */
 	u32 flags;
+
+	/**
+	 * @cleaned: The buffer object pages have been cleaned.
+	 *
+	 * There could be dirty CPU cachelines for the pages of buffer object
+	 * after allocation, as shmem will zero out the pages. The cachelines
+	 * need to be cleaned if the pages are going to be accessed with an
+	 * uncached CPU mapping.
+	 */
+	bool cleaned;
 };

 /**
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index f522a116c1b1..d8cc9e7d064e 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -1249,6 +1249,11 @@ static int panthor_vm_prepare_map_op_ctx(struct panthor_vm_op_ctx *op_ctx,

 	op_ctx->map.sgt = sgt;

+	if (bo->base.map_wc && !bo->cleaned) {
+		dma_sync_sgtable_for_device(vm->ptdev->base.dev, sgt, DMA_TO_DEVICE);
+		bo->cleaned = true;
+	}
+
 	preallocated_vm_bo = drm_gpuvm_bo_create(&vm->base, &bo->base.base);
 	if (!preallocated_vm_bo) {
 		if (!bo->base.base.import_attach)
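To make the overwrite scenario in the commit message concrete, here is a small, purely hypothetical illustration of the ordering the patch guards against (this is not driver code; the function and parameter names are invented), assuming "page" backs a BO that userspace will later access through a write-combined CPU mapping.

```c
#include <linux/dma-mapping.h>
#include <linux/highmem.h>

/*
 * Hypothetical illustration of the hazard, not actual driver code.
 * "page" is one of the BO's freshly allocated shmem pages; "sgt" and
 * "dev" describe the BO for DMA API purposes.
 */
static void illustrate_zero_then_clean(struct device *dev,
				       struct sg_table *sgt,
				       struct page *page)
{
	/*
	 * 1. shmem zeroes the page through the kernel's cacheable mapping
	 *    on allocation; on arm64 this leaves dirty zero-filled
	 *    cachelines that may not have reached DRAM yet.
	 */
	clear_highpage(page);

	/*
	 * 2. Cleaning the cachelines writes the zeroes out to memory.
	 *    Without this step, data later stored through the uncached
	 *    (write-combined) CPU mapping goes straight to DRAM, and a
	 *    subsequent eviction of the stale dirty lines from step 1
	 *    silently overwrites it.
	 */
	dma_sync_sgtable_for_device(dev, sgt, DMA_TO_DEVICE);
}
```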