diff mbox series

drm/i915: Check caller held wakerefs in assert_forcewakes_active

Message ID 20190703121204.30754-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show
Series drm/i915: Check caller held wakerefs in assert_forcewakes_active | expand

Commit Message

Chris Wilson July 3, 2019, 12:12 p.m. UTC
The intent of the assert is to document that the caller took the
appropriate wakerefs for the function. However, as Tvrtko pointed out,
we simply check whether the fw_domains are active and may be confused by
the auto wakeref which may be dropped between the check and use. Let's
be more careful in the assert and check that each fw_domain has an
explicit caller wakeref above and beyond the automatic wakeref.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

Comments

Tvrtko Ursulin July 3, 2019, 12:43 p.m. UTC | #1
On 03/07/2019 13:12, Chris Wilson wrote:
> The intent of the assert is to document that the caller took the
> appropriate wakerefs for the function. However, as Tvrtko pointed out,
> we simply check whether the fw_domains are active and may be confused by
> the auto wakeref which may be dropped between the check and use. Let's
> be more careful in the assert and check that each fw_domain has an
> explicit caller wakeref above and beyond the automatic wakeref.

Although we still don't know if it is our caller who took the fw or some 
unrelated thread. Still, a more thorough check is better even if not 
perfect.

> 
> Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/intel_uncore.c | 24 ++++++++++++++++++++++++
>   1 file changed, 24 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 68d54e126d79..bc25a6e51463 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -738,6 +738,12 @@ void assert_forcewakes_inactive(struct intel_uncore *uncore)
>   void assert_forcewakes_active(struct intel_uncore *uncore,
>   			      enum forcewake_domains fw_domains)
>   {
> +	struct intel_uncore_forcewake_domain *domain;
> +	unsigned int tmp;
> +
> +	if (!IS_ENABLED(CONFIG_DRM_i915_RUNTIME_PM))
> +		return;
> +

If uncore->funcs.force_wake_get is set why wouldn't we still want to run 
the asserts?

>   	if (!uncore->funcs.force_wake_get)
>   		return;
>   
> @@ -747,6 +753,24 @@ void assert_forcewakes_active(struct intel_uncore *uncore,
>   	WARN(fw_domains & ~uncore->fw_domains_active,
>   	     "Expected %08x fw_domains to be active, but %08x are off\n",
>   	     fw_domains, fw_domains & ~uncore->fw_domains_active);
> +
> +	/*
> +	 * Check that the caller has an explicit wakeref and we don't mistake
> +	 * it for the auto wakeref.
> +	 */
> +	local_irq_disable();
> +	for_each_fw_domain_masked(domain, fw_domains, uncore, tmp) {
> +		unsigned int expect = 1;
> +
> +		if (hrtimer_active(&domain->timer))
> +			expect++;
> +
> +		if (WARN(domain->wake_count < expect,
> +			 "Expected domain %d to be held awake by caller\n",
> +			 domain->id))
> +			break;
> +	}
> +	local_irq_enable();

This part looks good. Let wait and see if CI calls me a fool.

Regards,

Tvrtko

>   }
>   
>   /* We give fast paths for the really cool registers */
>
Chris Wilson July 3, 2019, 12:48 p.m. UTC | #2
Quoting Tvrtko Ursulin (2019-07-03 13:43:34)
> 
> On 03/07/2019 13:12, Chris Wilson wrote:
> > The intent of the assert is to document that the caller took the
> > appropriate wakerefs for the function. However, as Tvrtko pointed out,
> > we simply check whether the fw_domains are active and may be confused by
> > the auto wakeref which may be dropped between the check and use. Let's
> > be more careful in the assert and check that each fw_domain has an
> > explicit caller wakeref above and beyond the automatic wakeref.
> 
> Although we still don't know if it is our caller who took the fw or some 
> unrelated thread. Still, a more thorough check is better even if not 
> perfect.

Since it's not a mutex, we can't stuff an owner field in here, the only
way to properly track in that case would be to return cookies from
forcewake_get and verify the cookies are still active.

Not feeling the inclination to do that yet. Maybe if we get a few fw
leaks.

> > Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/intel_uncore.c | 24 ++++++++++++++++++++++++
> >   1 file changed, 24 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> > index 68d54e126d79..bc25a6e51463 100644
> > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > @@ -738,6 +738,12 @@ void assert_forcewakes_inactive(struct intel_uncore *uncore)
> >   void assert_forcewakes_active(struct intel_uncore *uncore,
> >                             enum forcewake_domains fw_domains)
> >   {
> > +     struct intel_uncore_forcewake_domain *domain;
> > +     unsigned int tmp;
> > +
> > +     if (!IS_ENABLED(CONFIG_DRM_i915_RUNTIME_PM))
> > +             return;
> > +
> 
> If uncore->funcs.force_wake_get is set why wouldn't we still want to run 
> the asserts?

I'm just being worried by adding a loop under irq-off and didn't want to
add more trouble to non-debug kernels. (Closing the stable door much?)
> 
> >       if (!uncore->funcs.force_wake_get)
> >               return;
> >   
> > @@ -747,6 +753,24 @@ void assert_forcewakes_active(struct intel_uncore *uncore,
> >       WARN(fw_domains & ~uncore->fw_domains_active,
> >            "Expected %08x fw_domains to be active, but %08x are off\n",
> >            fw_domains, fw_domains & ~uncore->fw_domains_active);
> > +
> > +     /*
> > +      * Check that the caller has an explicit wakeref and we don't mistake
> > +      * it for the auto wakeref.
> > +      */
> > +     local_irq_disable();
> > +     for_each_fw_domain_masked(domain, fw_domains, uncore, tmp) {
> > +             unsigned int expect = 1;
> > +
> > +             if (hrtimer_active(&domain->timer))
> > +                     expect++;
> > +
> > +             if (WARN(domain->wake_count < expect,
> > +                      "Expected domain %d to be held awake by caller\n",
> > +                      domain->id))
> > +                     break;
> > +     }
> > +     local_irq_enable();
> 
> This part looks good. Let wait and see if CI calls me a fool.

Aye, that's what I'm waiting for as well. Personalized insults from CI :)
-Chris
Tvrtko Ursulin July 4, 2019, 8:10 a.m. UTC | #3
On 03/07/2019 13:48, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-07-03 13:43:34)
>>
>> On 03/07/2019 13:12, Chris Wilson wrote:
>>> The intent of the assert is to document that the caller took the
>>> appropriate wakerefs for the function. However, as Tvrtko pointed out,
>>> we simply check whether the fw_domains are active and may be confused by
>>> the auto wakeref which may be dropped between the check and use. Let's
>>> be more careful in the assert and check that each fw_domain has an
>>> explicit caller wakeref above and beyond the automatic wakeref.
>>
>> Although we still don't know if it is our caller who took the fw or some
>> unrelated thread. Still, a more thorough check is better even if not
>> perfect.
> 
> Since it's not a mutex, we can't stuff an owner field in here, the only
> way to properly track in that case would be to return cookies from
> forcewake_get and verify the cookies are still active.
> 
> Not feeling the inclination to do that yet. Maybe if we get a few fw
> leaks.

Agreed, I wasn't trying to say we have to do it.

> 
>>> Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/intel_uncore.c | 24 ++++++++++++++++++++++++
>>>    1 file changed, 24 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
>>> index 68d54e126d79..bc25a6e51463 100644
>>> --- a/drivers/gpu/drm/i915/intel_uncore.c
>>> +++ b/drivers/gpu/drm/i915/intel_uncore.c
>>> @@ -738,6 +738,12 @@ void assert_forcewakes_inactive(struct intel_uncore *uncore)
>>>    void assert_forcewakes_active(struct intel_uncore *uncore,
>>>                              enum forcewake_domains fw_domains)
>>>    {
>>> +     struct intel_uncore_forcewake_domain *domain;
>>> +     unsigned int tmp;
>>> +
>>> +     if (!IS_ENABLED(CONFIG_DRM_i915_RUNTIME_PM))
>>> +             return;
>>> +
>>
>> If uncore->funcs.force_wake_get is set why wouldn't we still want to run
>> the asserts?
> 
> I'm just being worried by adding a loop under irq-off and didn't want to
> add more trouble to non-debug kernels. (Closing the stable door much?)

What is the connection between debug/non-debug kernels and 
CONFIG_DRM_i915_RUNTIME_PM?

Regards,

Tvrtko

>>
>>>        if (!uncore->funcs.force_wake_get)
>>>                return;
>>>    
>>> @@ -747,6 +753,24 @@ void assert_forcewakes_active(struct intel_uncore *uncore,
>>>        WARN(fw_domains & ~uncore->fw_domains_active,
>>>             "Expected %08x fw_domains to be active, but %08x are off\n",
>>>             fw_domains, fw_domains & ~uncore->fw_domains_active);
>>> +
>>> +     /*
>>> +      * Check that the caller has an explicit wakeref and we don't mistake
>>> +      * it for the auto wakeref.
>>> +      */
>>> +     local_irq_disable();
>>> +     for_each_fw_domain_masked(domain, fw_domains, uncore, tmp) {
>>> +             unsigned int expect = 1;
>>> +
>>> +             if (hrtimer_active(&domain->timer))
>>> +                     expect++;
>>> +
>>> +             if (WARN(domain->wake_count < expect,
>>> +                      "Expected domain %d to be held awake by caller\n",
>>> +                      domain->id))
>>> +                     break;
>>> +     }
>>> +     local_irq_enable();
>>
>> This part looks good. Let wait and see if CI calls me a fool.
> 
> Aye, that's what I'm waiting for as well. Personalized insults from CI :)
> -Chris
>
Chris Wilson July 4, 2019, 8:15 a.m. UTC | #4
Quoting Tvrtko Ursulin (2019-07-04 09:10:32)
> 
> On 03/07/2019 13:48, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-07-03 13:43:34)
> >>
> >> On 03/07/2019 13:12, Chris Wilson wrote:
> >>> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> >>> index 68d54e126d79..bc25a6e51463 100644
> >>> --- a/drivers/gpu/drm/i915/intel_uncore.c
> >>> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> >>> @@ -738,6 +738,12 @@ void assert_forcewakes_inactive(struct intel_uncore *uncore)
> >>>    void assert_forcewakes_active(struct intel_uncore *uncore,
> >>>                              enum forcewake_domains fw_domains)
> >>>    {
> >>> +     struct intel_uncore_forcewake_domain *domain;
> >>> +     unsigned int tmp;
> >>> +
> >>> +     if (!IS_ENABLED(CONFIG_DRM_i915_RUNTIME_PM))
> >>> +             return;
> >>> +
> >>
> >> If uncore->funcs.force_wake_get is set why wouldn't we still want to run
> >> the asserts?
> > 
> > I'm just being worried by adding a loop under irq-off and didn't want to
> > add more trouble to non-debug kernels. (Closing the stable door much?)
> 
> What is the connection between debug/non-debug kernels and 
> CONFIG_DRM_i915_RUNTIME_PM?

Well, I can't type obviously. It's meant to be the symbol for
	config DRM_I915_DEBUG_RUNTIME_PM
to match the other assert in use.
-Chris
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 68d54e126d79..bc25a6e51463 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -738,6 +738,12 @@  void assert_forcewakes_inactive(struct intel_uncore *uncore)
 void assert_forcewakes_active(struct intel_uncore *uncore,
 			      enum forcewake_domains fw_domains)
 {
+	struct intel_uncore_forcewake_domain *domain;
+	unsigned int tmp;
+
+	if (!IS_ENABLED(CONFIG_DRM_i915_RUNTIME_PM))
+		return;
+
 	if (!uncore->funcs.force_wake_get)
 		return;
 
@@ -747,6 +753,24 @@  void assert_forcewakes_active(struct intel_uncore *uncore,
 	WARN(fw_domains & ~uncore->fw_domains_active,
 	     "Expected %08x fw_domains to be active, but %08x are off\n",
 	     fw_domains, fw_domains & ~uncore->fw_domains_active);
+
+	/*
+	 * Check that the caller has an explicit wakeref and we don't mistake
+	 * it for the auto wakeref.
+	 */
+	local_irq_disable();
+	for_each_fw_domain_masked(domain, fw_domains, uncore, tmp) {
+		unsigned int expect = 1;
+
+		if (hrtimer_active(&domain->timer))
+			expect++;
+
+		if (WARN(domain->wake_count < expect,
+			 "Expected domain %d to be held awake by caller\n",
+			 domain->id))
+			break;
+	}
+	local_irq_enable();
 }
 
 /* We give fast paths for the really cool registers */