diff mbox series

drm/i915/active: Fix missing debug object activation

Message ID 20230310141138.6592-1-nirmoy.das@intel.com (mailing list archive)
State New, archived

Commit Message

Nirmoy Das March 10, 2023, 2:11 p.m. UTC
debug_active_activate() expected ref->count to be zero
which is not true anymore as __i915_active_activate() calls
debug_active_activate() after incrementing the count.

Fixes: 04240e30ed06 ("drm/i915: Skip taking acquire mutex for no ref->active callback")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
---
 drivers/gpu/drm/i915/i915_active.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
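
The ordering problem described in the commit message can be reproduced in a small user-space model. This is only a sketch: `struct model_active`, `model_activate()` and the two hook functions are hypothetical stand-ins rather than i915 code, C11 atomics replace the kernel's `atomic_t`, and a plain counter replaces `debug_object_activate()`. The tree_lock is left out because the model is single-threaded.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct i915_active. */
struct model_active {
	atomic_int count;
	int activations;	/* how often the debug object was activated */
};

/* Pre-patch check: only fires while the count is still zero. */
static void model_debug_activate_old(struct model_active *ref)
{
	if (!atomic_load(&ref->count)) /* before the first inc */
		ref->activations++;
}

/* Patched check: fires when the count shows exactly one reference. */
static void model_debug_activate_new(struct model_active *ref)
{
	if (atomic_load(&ref->count) == 1) /* after the first inc */
		ref->activations++;
}

/* Same ordering as in the commit message: increment first, then the hook. */
static int model_activate(void (*hook)(struct model_active *))
{
	struct model_active ref = { .activations = 0 };

	assert(hook != NULL);
	atomic_init(&ref.count, 0);
	atomic_fetch_add(&ref.count, 1);
	hook(&ref);
	return ref.activations;
}
```

In this model the old hook never records an activation because it always runs after the increment, while the patched hook records exactly one.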

Comments

Janusz Krzysztofik March 10, 2023, 3:19 p.m. UTC | #1
Hi Nirmoy,

On Friday, 10 March 2023 15:11:38 CET Nirmoy Das wrote:
> debug_active_activate() expected ref->count to be zero
> which is not true anymore as __i915_active_activate() calls
> debug_active_activate() after incrementing the count.
> 
> Fixes: 04240e30ed06 ("drm/i915: Skip taking acquire mutex for no ref->active callback")
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Thomas Hellström <thomas.hellstrom@intel.com>
> Cc: Andi Shyti <andi.shyti@linux.intel.com>
> Cc: intel-gfx@lists.freedesktop.org
> Cc: <stable@vger.kernel.org> # v5.10+
> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_active.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
> index a9fea115f2d2..1c3066eb359a 100644
> --- a/drivers/gpu/drm/i915/i915_active.c
> +++ b/drivers/gpu/drm/i915/i915_active.c
> @@ -92,7 +92,7 @@ static void debug_active_init(struct i915_active *ref)
>  static void debug_active_activate(struct i915_active *ref)
>  {
>  	lockdep_assert_held(&ref->tree_lock);
> -	if (!atomic_read(&ref->count)) /* before the first inc */
> +	if (atomic_read(&ref->count) == 1) /* after the first inc */

While that's obviously better than never calling debug_object_activate(), I'm 
wondering how likely we are to still miss it because of the counter being 
incremented, e.g. via i915_active_acquire_if_busy(), by a concurrent thread.  
Since __i915_active_activate() is the only user and its decision-making step 
is serialized against itself with a spinlock, wouldn't it be better to call 
debug_object_activate() unconditionally here?

Thanks,
Janusz

>  		debug_object_activate(ref, &active_debug_desc);
>  }
>  
>
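
The interleaving raised in comment #1 can be replayed deterministically in a single thread. This is a sketch under stated assumptions: `struct race_model`, `race_model_inc_not_zero()` and `race_model_replay()` are hypothetical names, the increment-unless-zero helper only approximates what a function such as i915_active_acquire_if_busy() is assumed to do, and a plain counter stands in for debug_object_activate().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct i915_active. */
struct race_model {
	atomic_int count;
	int activations;
};

/* Stand-in for an acquire-if-busy helper: increment unless the count is zero. */
static bool race_model_inc_not_zero(struct race_model *ref)
{
	int old = atomic_load(&ref->count);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&ref->count, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Replay one interleaving: the activating caller makes the first increment,
 * a second caller bumps the now non-zero count, and only then does the debug
 * hook run. Returns the number of recorded activations.
 */
static int race_model_replay(bool unconditional)
{
	struct race_model ref = { .activations = 0 };

	atomic_init(&ref.count, 0);
	if (atomic_fetch_add(&ref.count, 1) == 0) {	/* first inc */
		bool bumped = race_model_inc_not_zero(&ref); /* "other thread" */

		assert(bumped);
		if (unconditional || atomic_load(&ref.count) == 1)
			ref.activations++;
	}
	return ref.activations;
}
```

With the `== 1` check the replay records no activation, because the count is already 2 when the hook reads it. The unconditional variant records one.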
Nirmoy Das March 10, 2023, 4:48 p.m. UTC | #2
Hi Janusz,

On 3/10/2023 4:19 PM, Janusz Krzysztofik wrote:
> Hi Nirmoy,
>
> On Friday, 10 March 2023 15:11:38 CET Nirmoy Das wrote:
>> debug_active_activate() expected ref->count to be zero
>> which is not true anymore as __i915_active_activate() calls
>> debug_active_activate() after incrementing the count.
>>
>> Fixes: 04240e30ed06 ("drm/i915: Skip taking acquire mutex for no ref->active callback")
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Thomas Hellström <thomas.hellstrom@intel.com>
>> Cc: Andi Shyti <andi.shyti@linux.intel.com>
>> Cc: intel-gfx@lists.freedesktop.org
>> Cc: <stable@vger.kernel.org> # v5.10+
>> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_active.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
>> index a9fea115f2d2..1c3066eb359a 100644
>> --- a/drivers/gpu/drm/i915/i915_active.c
>> +++ b/drivers/gpu/drm/i915/i915_active.c
>> @@ -92,7 +92,7 @@ static void debug_active_init(struct i915_active *ref)
>>   static void debug_active_activate(struct i915_active *ref)
>>   {
>>   	lockdep_assert_held(&ref->tree_lock);
>> -	if (!atomic_read(&ref->count)) /* before the first inc */
>> +	if (atomic_read(&ref->count) == 1) /* after the first inc */
> While that's obviously better than never calling debug_active_activate(), I'm
> wondering how likely we can still miss it because the counter being
> incremented, e.g. via i915_active_acquire_if_busy(), by a concurrent thread.
> Since __i915_active_activate() is the only user and its decision making step
> is serialized against itself with a spinlock, couldn't we better call
> debug_object_activate() unconditionally here?


Yes, we can call debug_object_activate() without checking ref->count.
Also, we can remove the ref->count check in debug_active_deactivate(), as
this is wrapped with
"atomic_dec_and_lock_irqsave(&ref->count, &ref->tree_lock, flags)".

I think it makes sense to keep this patch as it is so it can be
backported easily. I can add another patch to remove the unnecessary
ref->count checks.


Regards,

Nirmoy


>
> Thanks,
> Janusz
>
>>   		debug_object_activate(ref, &active_debug_desc);
>>   }
>>   
>>
>
>
>
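
The argument in comment #2 about the deactivate side can be sketched the same way. The names below (`struct retire_model`, `retire_model_put()`, `retire_model_run()`) are hypothetical, and `retire_model_dec_and_test()` is a lock-free approximation of atomic_dec_and_lock_irqsave() that keeps only the "true when the count reaches zero" part. Treat this as an illustration of the reasoning, not as the driver's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct i915_active. */
struct retire_model {
	atomic_int count;
	int deactivations;
};

/* Approximates atomic_dec_and_lock(): true only when the count hits zero. */
static bool retire_model_dec_and_test(struct retire_model *ref)
{
	return atomic_fetch_sub(&ref->count, 1) == 1;
}

/* Drop one reference; the debug hook is reached only on the last drop. */
static void retire_model_put(struct retire_model *ref)
{
	if (!retire_model_dec_and_test(ref))
		return;	/* not the last reference */

	/* The count is already known to be zero, so a check adds nothing. */
	assert(atomic_load(&ref->count) == 0);
	ref->deactivations++;	/* unconditional debug hook */
}

/* Take 'refs' references, drop them all, and report the deactivations. */
static int retire_model_run(int refs)
{
	struct retire_model ref = { .deactivations = 0 };

	atomic_init(&ref.count, refs);
	for (int i = 0; i < refs; i++)
		retire_model_put(&ref);
	return ref.deactivations;
}
```

However many references are dropped, the hook runs exactly once, on the drop that takes the count to zero.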
Janusz Krzysztofik March 13, 2023, 9:55 a.m. UTC | #3
On Friday, 10 March 2023 17:48:10 CET Das, Nirmoy wrote:
> Hi Janusz,
> 
> On 3/10/2023 4:19 PM, Janusz Krzysztofik wrote:
> > Hi Nirmoy,
> >
> > On Friday, 10 March 2023 15:11:38 CET Nirmoy Das wrote:
> >> debug_active_activate() expected ref->count to be zero
> >> which is not true anymore as __i915_active_activate() calls
> >> debug_active_activate() after incrementing the count.
> >>
> >> Fixes: 04240e30ed06 ("drm/i915: Skip taking acquire mutex for no ref->active callback")
> >> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> >> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> Cc: Thomas Hellström <thomas.hellstrom@intel.com>
> >> Cc: Andi Shyti <andi.shyti@linux.intel.com>
> >> Cc: intel-gfx@lists.freedesktop.org
> >> Cc: <stable@vger.kernel.org> # v5.10+
> >> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/i915_active.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
> >> index a9fea115f2d2..1c3066eb359a 100644
> >> --- a/drivers/gpu/drm/i915/i915_active.c
> >> +++ b/drivers/gpu/drm/i915/i915_active.c
> >> @@ -92,7 +92,7 @@ static void debug_active_init(struct i915_active *ref)
> >>   static void debug_active_activate(struct i915_active *ref)
> >>   {
> >>   	lockdep_assert_held(&ref->tree_lock);
> >> -	if (!atomic_read(&ref->count)) /* before the first inc */
> >> +	if (atomic_read(&ref->count) == 1) /* after the first inc */
> > > While that's obviously better than never calling debug_active_activate(), I'm
> > > wondering how likely we can still miss it because the counter being
> > > incremented, e.g. via i915_active_acquire_if_busy(), by a concurrent thread.
> > > Since __i915_active_activate() is the only user and its decision making step
> > > is serialized against itself with a spinlock, couldn't we better call
> > > debug_object_activate() unconditionally here?
> 
> 
> Yes, we can call debug_object_activate() without checking ref->count. 
> Also we can remove the ref-count check for
> 
> debug_active_deactivate() as this is wrapped with 
> "atomic_dec_and_lock_irqsave(&ref->count, &ref->tree_lock, flags)".
> 
> 
> I think it makes sense to keep this patch as it is so it can be 
> backported easily. I can add another patch to remove
> 
> unnecessary ref->count  checks.

Looking at 5.10, I can't understand how dropping the check instead of 
replacing it with a still problematic one could make backporting less easy.

Thanks,
Janusz


> 
> 
> Regards,
> 
> Nirmoy
> 
> 
> >
> > Thanks,
> > Janusz
> >
> >>   		debug_object_activate(ref, &active_debug_desc);
> >>   }
> >>   
> >>
> >
> >
> >
>
Nirmoy Das March 13, 2023, 10:33 a.m. UTC | #4
On 3/13/2023 10:55 AM, Janusz Krzysztofik wrote:
> On Friday, 10 March 2023 17:48:10 CET Das, Nirmoy wrote:
>> Hi Janusz,
>>
>> On 3/10/2023 4:19 PM, Janusz Krzysztofik wrote:
>>> Hi Nirmoy,
>>>
>>> On Friday, 10 March 2023 15:11:38 CET Nirmoy Das wrote:
>>>> debug_active_activate() expected ref->count to be zero
>>>> which is not true anymore as __i915_active_activate() calls
>>>> debug_active_activate() after incrementing the count.
>>>>
>>>> Fixes: 04240e30ed06 ("drm/i915: Skip taking acquire mutex for no ref->active callback")
>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> Cc: Thomas Hellström <thomas.hellstrom@intel.com>
>>>> Cc: Andi Shyti <andi.shyti@linux.intel.com>
>>>> Cc: intel-gfx@lists.freedesktop.org
>>>> Cc: <stable@vger.kernel.org> # v5.10+
>>>> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/i915_active.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
>>>> index a9fea115f2d2..1c3066eb359a 100644
>>>> --- a/drivers/gpu/drm/i915/i915_active.c
>>>> +++ b/drivers/gpu/drm/i915/i915_active.c
>>>> @@ -92,7 +92,7 @@ static void debug_active_init(struct i915_active *ref)
>>>>    static void debug_active_activate(struct i915_active *ref)
>>>>    {
>>>>    	lockdep_assert_held(&ref->tree_lock);
>>>> -	if (!atomic_read(&ref->count)) /* before the first inc */
>>>> +	if (atomic_read(&ref->count) == 1) /* after the first inc */
>>> While that's obviously better than never calling debug_active_activate(), I'm
>>> wondering how likely we can still miss it because the counter being
>>> incremented, e.g. via i915_active_acquire_if_busy(), by a concurrent thread.
>>> Since __i915_active_activate() is the only user and its decision making step
>>> is serialized against itself with a spinlock, couldn't we better call
>>> debug_object_activate() unconditionally here?
>>
>> Yes, we can call debug_object_activate() without checking ref->count.
>> Also we can remove the ref-count check for
>>
>> debug_active_deactivate() as this is wrapped with
>> "atomic_dec_and_lock_irqsave(&ref->count, &ref->tree_lock, flags)".
>>
>>
>> I think it makes sense to keep this patch as it is so it can be
>> backported easily. I can add another patch to remove
>>
>> unnecessary ref->count  checks.
> Looking at 5.10, I can't understand how dropping the check instead of
> replacing it with a still problematic one could make backporting less easy.


Indeed, I thought 5.10 was pretty far in the past but I was wrong. I can 
apply the modified patch.

Sent out a v2.

Thanks,

Nirmoy

>
> Thanks,
> Janusz
>
>
>>
>> Regards,
>>
>> Nirmoy
>>
>>
>>> Thanks,
>>> Janusz
>>>
>>>>    		debug_object_activate(ref, &active_debug_desc);
>>>>    }
>>>>    
>>>>
>>>
>>>
>
>
>
Patch

diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index a9fea115f2d2..1c3066eb359a 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -92,7 +92,7 @@  static void debug_active_init(struct i915_active *ref)
 static void debug_active_activate(struct i915_active *ref)
 {
 	lockdep_assert_held(&ref->tree_lock);
-	if (!atomic_read(&ref->count)) /* before the first inc */
+	if (atomic_read(&ref->count) == 1) /* after the first inc */
 		debug_object_activate(ref, &active_debug_desc);
 }