[v2,1/4] drm/i915: Fix false-positive assert_rpm_wakelock_held in i915_pmic_bus_access_notifier

Message ID 20170706192450.28477-1-hdegoede@redhat.com (mailing list archive)
State New, archived

Commit Message

Hans de Goede July 6, 2017, 7:24 p.m. UTC
assert_rpm_wakelock_held is triggered from i915_pmic_bus_access_notifier
even though the notifier gets unregistered on (runtime) suspend. This is
caused by a race that happens under the following circumstances:

intel_runtime_pm_put does:

   atomic_dec(&dev_priv->pm.wakeref_count);

   pm_runtime_mark_last_busy(kdev);
   pm_runtime_put_autosuspend(kdev);

And pm_runtime_put_autosuspend calls intel_runtime_suspend from
a workqueue, so there is ample time between the atomic_dec() and
intel_runtime_suspend() unregistering the notifier. If the notifier
gets called in this window, assert_rpm_wakelock_held falsely triggers
(at this point we're not runtime-suspended yet).
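
To make the window concrete, here is a minimal single-threaded model of
the ordering described above. It is only a sketch: struct rpm_model and
the model_* functions are hypothetical stand-ins, not i915 code, and the
deferred workqueue is reduced to an explicit call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for driver state; not the real i915 structures. */
struct rpm_model {
	int wakeref_count;        /* models dev_priv->pm.wakeref_count */
	bool notifier_registered; /* models the PMIC bus access notifier */
	bool suspended;           /* set once the deferred suspend has run */
	int false_positives;      /* assert hits while the device is still awake */
};

static void model_init(struct rpm_model *m)
{
	m->wakeref_count = 1;
	m->notifier_registered = true;
	m->suspended = false;
	m->false_positives = 0;
}

/* Models intel_runtime_pm_put(): the count drops now, suspend comes later. */
static void model_runtime_pm_put(struct rpm_model *m)
{
	assert(m->wakeref_count > 0);
	m->wakeref_count--;
	/* The real code only queues autosuspend here; nothing else changes. */
}

/* Models the deferred intel_runtime_suspend() run from the workqueue. */
static void model_runtime_suspend(struct rpm_model *m)
{
	m->notifier_registered = false;
	m->suspended = true;
}

/* Models the notifier firing; returns true if the notifier actually ran. */
static bool model_notifier_call(struct rpm_model *m)
{
	if (!m->notifier_registered)
		return false;
	/* Models assert_rpm_wakelock_held(): it complains when the count is 0. */
	if (m->wakeref_count == 0 && !m->suspended)
		m->false_positives++;
	return true;
}
```

Calling model_runtime_pm_put() and then model_notifier_call() records a
false positive, because the count is already zero while the notifier is
still registered. Once model_runtime_suspend() has run, the notifier no
longer fires at all.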

This commit adds disable_rpm_wakeref_asserts and
enable_rpm_wakeref_asserts calls around the
intel_uncore_forcewake_get(FORCEWAKE_ALL) call in
i915_pmic_bus_access_notifier fixing the false-positive WARN_ON.
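
The effect of the wrapping can be sketched with a toy counter. This
assumes (the thread does not spell it out) that the disable/enable
helpers work by temporarily holding a reference on the same count the
assert checks, and that the forcewake call is where the assert lives.
All model_* names are hypothetical.

```c
#include <assert.h>

static int wakeref_count; /* models dev_priv->pm.wakeref_count, starts at 0 */
static int warnings;      /* number of times the modelled assert fired */

/* Models assert_rpm_wakelock_held(): warn when no reference is held. */
static void model_assert_wakelock_held(void)
{
	if (wakeref_count == 0)
		warnings++;
}

/* Assumption: disabling the asserts is modelled as taking a reference. */
static void model_disable_asserts(void)
{
	wakeref_count++;
}

/* Assumption: re-enabling the asserts drops that reference again. */
static void model_enable_asserts(void)
{
	assert(wakeref_count > 0);
	wakeref_count--;
}

/* Models intel_uncore_forcewake_get(), which runs the assert first. */
static void model_forcewake_get(void)
{
	model_assert_wakelock_held();
}

/* Notifier body before the patch; returns the warning count so far. */
static int model_notifier_unguarded(void)
{
	model_forcewake_get();
	return warnings;
}

/* Notifier body after the patch: the call is bracketed by the helpers. */
static int model_notifier_guarded(void)
{
	model_disable_asserts();
	model_forcewake_get();
	model_enable_asserts();
	return warnings;
}
```

With the count at zero, model_notifier_unguarded() records a warning
while model_notifier_guarded() does not, and the count is back at its
previous value once the guarded call returns.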

Reported-by: FKr <bugs-freedesktop@ubermail.me>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
Changes in v2:
-Rebase on current (July 6th 2017) drm-next
---
 drivers/gpu/drm/i915/intel_uncore.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Imre Deak July 27, 2017, 2:35 p.m. UTC | #1
Hi,

On Thu, Jul 06, 2017 at 09:24:47PM +0200, Hans de Goede wrote:
> assert_rpm_wakelock_held is triggered from i915_pmic_bus_access_notifier
> even though the notifier gets unregistered on (runtime) suspend. This is
> caused by a race that happens under the following circumstances:
> 
> intel_runtime_pm_put does:
> 
>    atomic_dec(&dev_priv->pm.wakeref_count);
> 
>    pm_runtime_mark_last_busy(kdev);
>    pm_runtime_put_autosuspend(kdev);
> 
> And pm_runtime_put_autosuspend calls intel_runtime_suspend from
> a workqueue, so there is ample time between the atomic_dec() and
> intel_runtime_suspend() unregistering the notifier. If the notifier
> gets called in this window, assert_rpm_wakelock_held falsely triggers
> (at this point we're not runtime-suspended yet).
> 
> This commit adds disable_rpm_wakeref_asserts and
> enable_rpm_wakeref_asserts calls around the
> intel_uncore_forcewake_get(FORCEWAKE_ALL) call in
> i915_pmic_bus_access_notifier fixing the false-positive WARN_ON.
> 
> Reported-by: FKr <bugs-freedesktop@ubermail.me>
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
> Changes in v2:
> -Rebase on current (July 6th 2017) drm-next
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 9882724bc2b6..168b28a87f76 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -1171,8 +1171,15 @@ static int i915_pmic_bus_access_notifier(struct notifier_block *nb,
>  		 * bus, which will be busy after this notification, leading to:
>  		 * "render: timed out waiting for forcewake ack request."
>  		 * errors.
> +		 *
> +		 * This notifier may get called between intel_runtime_pm_put()
> +		 * doing atomic_dec(wakeref_count) and intel_runtime_suspend()
> +		 * unregistering this notifier, which leads to false-positive
> +		 * assert_rpm_wakelock_held() triggering.

The following would better describe the reason for disabling wakeref asserts.
That is, we access the HW without holding a runtime PM reference, but it's ok
here since it's handled as a special case during runtime suspend:

		* The notifier is unregistered during intel_runtime_suspend(),
		* so it's ok to access the HW here without holding an RPM
		* wake reference -> disable wakeref asserts for the time of
		* the access.

With that this looks ok:
Reviewed-by: Imre Deak <imre.deak@intel.com>


>  		 */
> +		disable_rpm_wakeref_asserts(dev_priv);
>  		intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> +		enable_rpm_wakeref_asserts(dev_priv);
>  		break;
>  	case MBI_PMIC_BUS_ACCESS_END:
>  		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> -- 
> 2.13.0
>
Hans de Goede Aug. 14, 2017, 6:48 p.m. UTC | #2
Hi,

On 27-07-17 16:35, Imre Deak wrote:
> Hi,
> 
> On Thu, Jul 06, 2017 at 09:24:47PM +0200, Hans de Goede wrote:
>> assert_rpm_wakelock_held is triggered from i915_pmic_bus_access_notifier
>> even though the notifier gets unregistered on (runtime) suspend. This is
>> caused by a race that happens under the following circumstances:
>>
>> intel_runtime_pm_put does:
>>
>>     atomic_dec(&dev_priv->pm.wakeref_count);
>>
>>     pm_runtime_mark_last_busy(kdev);
>>     pm_runtime_put_autosuspend(kdev);
>>
>> And pm_runtime_put_autosuspend calls intel_runtime_suspend from
>> a workqueue, so there is ample time between the atomic_dec() and
>> intel_runtime_suspend() unregistering the notifier. If the notifier
>> gets called in this window, assert_rpm_wakelock_held falsely triggers
>> (at this point we're not runtime-suspended yet).
>>
>> This commit adds disable_rpm_wakeref_asserts and
>> enable_rpm_wakeref_asserts calls around the
>> intel_uncore_forcewake_get(FORCEWAKE_ALL) call in
>> i915_pmic_bus_access_notifier fixing the false-positive WARN_ON.
>>
>> Reported-by: FKr <bugs-freedesktop@ubermail.me>
>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>> ---
>> Changes in v2:
>> -Rebase on current (July 6th 2017) drm-next
>> ---
>>   drivers/gpu/drm/i915/intel_uncore.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
>> index 9882724bc2b6..168b28a87f76 100644
>> --- a/drivers/gpu/drm/i915/intel_uncore.c
>> +++ b/drivers/gpu/drm/i915/intel_uncore.c
>> @@ -1171,8 +1171,15 @@ static int i915_pmic_bus_access_notifier(struct notifier_block *nb,
>>   		 * bus, which will be busy after this notification, leading to:
>>   		 * "render: timed out waiting for forcewake ack request."
>>   		 * errors.
>> +		 *
>> +		 * This notifier may get called between intel_runtime_pm_put()
>> +		 * doing atomic_dec(wakeref_count) and intel_runtime_suspend()
>> +		 * unregistering this notifier, which leads to false-positive
>> +		 * assert_rpm_wakelock_held() triggering.
> 
> The following would better describe the reason for disabling wakeref asserts.
> That is, we access the HW without holding a runtime PM reference, but it's ok
> here since it's handled as a special case during runtime suspend:
> 
> 		* The notifier is unregistered during intel_runtime_suspend(),
> 		* so it's ok to access the HW here without holding an RPM
> 		* wake reference -> disable wakeref asserts for the time of
> 		* the access.
> 
> With that this looks ok:
> Reviewed-by: Imre Deak <imre.deak@intel.com>

Thank you for the review. Unfortunately I've been a bit swamped with other
stuff, but I'm catching up now.

I've made the suggested change to the comment and added your
Reviewed-by for the upcoming v3 of this patch.

Regards,

Hans


> 
> 
>>   		 */
>> +		disable_rpm_wakeref_asserts(dev_priv);
>>   		intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
>> +		enable_rpm_wakeref_asserts(dev_priv);
>>   		break;
>>   	case MBI_PMIC_BUS_ACCESS_END:
>>   		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>> -- 
>> 2.13.0
>>

Patch

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 9882724bc2b6..168b28a87f76 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1171,8 +1171,15 @@  static int i915_pmic_bus_access_notifier(struct notifier_block *nb,
 		 * bus, which will be busy after this notification, leading to:
 		 * "render: timed out waiting for forcewake ack request."
 		 * errors.
+		 *
+		 * This notifier may get called between intel_runtime_pm_put()
+		 * doing atomic_dec(wakeref_count) and intel_runtime_suspend()
+		 * unregistering this notifier, which leads to false-positive
+		 * assert_rpm_wakelock_held() triggering.
 		 */
+		disable_rpm_wakeref_asserts(dev_priv);
 		intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+		enable_rpm_wakeref_asserts(dev_priv);
 		break;
 	case MBI_PMIC_BUS_ACCESS_END:
 		intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);