diff mbox series

drm/radeon: Call mmiowb() at the end of radeon_ring_commit()

Message ID 20240220074958.3288170-1-chenhuacai@loongson.cn (mailing list archive)
State New, archived
Headers show
Series drm/radeon: Call mmiowb() at the end of radeon_ring_commit() | expand

Commit Message

Huacai Chen Feb. 20, 2024, 7:49 a.m. UTC
Commit fb24ea52f78e0d595852e ("drivers: Remove explicit invocations of
mmiowb()") remove all mmiowb() in drivers, but it says:

"NOTE: mmiowb() has only ever guaranteed ordering in conjunction with
spin_unlock(). However, pairing each mmiowb() removal in this patch with
the corresponding call to spin_unlock() is not at all trivial, so there
is a small chance that this change may regress any drivers incorrectly
relying on mmiowb() to order MMIO writes between CPUs using lock-free
synchronisation."

The mmio in radeon_ring_commit() is protected by a mutex rather than a
spinlock, but in the mutex fastpath it behaves similar to spinlock and
need a mmiowb() to make sure the wptr is up-to-date for hardware.

Without this, we get such an error when run 'glxgears' on weak ordering
architectures such as LoongArch:

radeon 0000:04:00.0: ring 0 stalled for more than 10324msec
radeon 0000:04:00.0: ring 3 stalled for more than 10240msec
radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000001f412 last fence id 0x000000000001f414 on ring 3)
radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000000f940 last fence id 0x000000000000f941 on ring 0)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
radeon 0000:04:00.0: scheduling IB failed (-35).
[drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)

Cc: stable@vger.kernel.org
Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 drivers/gpu/drm/radeon/radeon_ring.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Christian König Feb. 20, 2024, 8:41 a.m. UTC | #1
Am 20.02.24 um 08:49 schrieb Huacai Chen:
> Commit fb24ea52f78e0d595852e ("drivers: Remove explicit invocations of
> mmiowb()") remove all mmiowb() in drivers, but it says:
>
> "NOTE: mmiowb() has only ever guaranteed ordering in conjunction with
> spin_unlock(). However, pairing each mmiowb() removal in this patch with
> the corresponding call to spin_unlock() is not at all trivial, so there
> is a small chance that this change may regress any drivers incorrectly
> relying on mmiowb() to order MMIO writes between CPUs using lock-free
> synchronisation."
>
> The mmio in radeon_ring_commit() is protected by a mutex rather than a
> spinlock, but in the mutex fastpath it behaves similar to spinlock and
> need a mmiowb() to make sure the wptr is up-to-date for hardware.

Well, if your hw platform can't guarantee that MMIO writes are ordered 
then I would say you can't use radeon in the first place since this is a 
mandatory prerequisite for correct hw behavior.

Doing this here as a workaround is just the tip of the iceberg and 
doesn't really fix the underlying problem.

I strongly suggest to change your writel() implementation to include an 
mmiowb() instead. If that is to heavy weight than at least the mutex 
handling should be changed instead of adding platform specific 
workarounds to a platform independent driver.

Regards,
Christian.

>
> Without this, we get such an error when run 'glxgears' on weak ordering
> architectures such as LoongArch:
>
> radeon 0000:04:00.0: ring 0 stalled for more than 10324msec
> radeon 0000:04:00.0: ring 3 stalled for more than 10240msec
> radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000001f412 last fence id 0x000000000001f414 on ring 3)
> radeon 0000:04:00.0: GPU lockup (current fence id 0x000000000000f940 last fence id 0x000000000000f941 on ring 0)
> radeon 0000:04:00.0: scheduling IB failed (-35).
> [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
> radeon 0000:04:00.0: scheduling IB failed (-35).
> [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
> radeon 0000:04:00.0: scheduling IB failed (-35).
> [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
> radeon 0000:04:00.0: scheduling IB failed (-35).
> [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
> radeon 0000:04:00.0: scheduling IB failed (-35).
> [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
> radeon 0000:04:00.0: scheduling IB failed (-35).
> [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
> radeon 0000:04:00.0: scheduling IB failed (-35).
> [drm:radeon_gem_va_ioctl [radeon]] *ERROR* Couldn't update BO_VA (-35)
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>   drivers/gpu/drm/radeon/radeon_ring.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
> index 38048593bb4a..d461dc85d820 100644
> --- a/drivers/gpu/drm/radeon/radeon_ring.c
> +++ b/drivers/gpu/drm/radeon/radeon_ring.c
> @@ -183,6 +183,7 @@ void radeon_ring_commit(struct radeon_device *rdev, struct radeon_ring *ring,
>   	if (hdp_flush && rdev->asic->mmio_hdp_flush)
>   		rdev->asic->mmio_hdp_flush(rdev);
>   	radeon_ring_set_wptr(rdev, ring);
> +	mmiowb(); /* Make sure wptr is up-to-date for hw */
>   }
>   
>   /**
diff mbox series

Patch

diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 38048593bb4a..d461dc85d820 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -183,6 +183,7 @@  void radeon_ring_commit(struct radeon_device *rdev, struct radeon_ring *ring,
 	if (hdp_flush && rdev->asic->mmio_hdp_flush)
 		rdev->asic->mmio_hdp_flush(rdev);
 	radeon_ring_set_wptr(rdev, ring);
+	mmiowb(); /* Make sure wptr is up-to-date for hw */
 }
 
 /**