diff mbox

drm/vc4: Fix false positive WARN() backtrace on refcount_inc() usage

Message ID 20171122203928.28135-1-boris.brezillon@free-electrons.com (mailing list archive)
State New, archived
Headers show

Commit Message

Boris BREZILLON Nov. 22, 2017, 8:39 p.m. UTC
With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's
passed a refcount object that has its counter set to 0. In this driver,
this is a valid use case since we want to increment ->usecnt only when
the BO object starts to be used by real HW components and this is
definitely not the case when the BO is created.

Fix the problem by using refcount_inc_not_zero() instead of
refcount_inc() and fallback to refcount_set(1) when
refcount_inc_not_zero() returns false. Note that this 2-steps operation
is not racy here because the whole section is protected by a mutex
which guarantees that the counter does not change between the
refcount_inc_not_zero() and refcount_set() calls.

Fixes: b9f19259b84d ("drm/vc4: Add the DRM_IOCTL_VC4_GEM_MADVISE ioctl")
Reported-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---
 drivers/gpu/drm/vc4/vc4_bo.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Eric Anholt Nov. 22, 2017, 9:16 p.m. UTC | #1
Boris Brezillon <boris.brezillon@free-electrons.com> writes:

> With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's
> passed a refcount object that has its counter set to 0. In this driver,
> this is a valid use case since we want to increment ->usecnt only when
> the BO object starts to be used by real HW components and this is
> definitely not the case when the BO is created.
>
> Fix the problem by using refcount_inc_not_zero() instead of
> refcount_inc() and fallback to refcount_set(1) when
> refcount_inc_not_zero() returns false. Note that this 2-steps operation
> is not racy here because the whole section is protected by a mutex
> which guarantees that the counter does not change between the
> refcount_inc_not_zero() and refcount_set() calls.

If we're not following the model, and protecting the refcount by a
mutex, shouldn't we just be using addition and subtraction instead of
refcount's atomics?
Boris BREZILLON Nov. 22, 2017, 9:28 p.m. UTC | #2
On Wed, 22 Nov 2017 13:16:08 -0800
Eric Anholt <eric@anholt.net> wrote:

> Boris Brezillon <boris.brezillon@free-electrons.com> writes:
> 
> > With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's
> > passed a refcount object that has its counter set to 0. In this driver,
> > this is a valid use case since we want to increment ->usecnt only when
> > the BO object starts to be used by real HW components and this is
> > definitely not the case when the BO is created.
> >
> > Fix the problem by using refcount_inc_not_zero() instead of
> > refcount_inc() and fallback to refcount_set(1) when
> > refcount_inc_not_zero() returns false. Note that this 2-steps operation
> > is not racy here because the whole section is protected by a mutex
> > which guarantees that the counter does not change between the
> > refcount_inc_not_zero() and refcount_set() calls.  
> 
> If we're not following the model, and protecting the refcount by a
> mutex, shouldn't we just be using addition and subtraction instead of
> refcount's atomics?

Actually, this mutex is protecting the bo->madv value which has to be
checked when the counter reaches 0 (when decrementing) or 1 (when
incrementing). We just benefit from this protection here, but ideally
it would be better to have an refcount_inc_allow_zero() as suggested by
Daniel.

Anyway, I can also use a regular counter and protect it with the madv
mutex, it's just that I thought using the existing refcount
infrastructure would be safer than open-coding it (actually, I found
several unbalanced refcounting issues thanks to this API while
developing the feature).
Eric Anholt Nov. 23, 2017, 12:13 a.m. UTC | #3
Boris Brezillon <boris.brezillon@free-electrons.com> writes:

> On Wed, 22 Nov 2017 13:16:08 -0800
> Eric Anholt <eric@anholt.net> wrote:
>
>> Boris Brezillon <boris.brezillon@free-electrons.com> writes:
>> 
>> > With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's
>> > passed a refcount object that has its counter set to 0. In this driver,
>> > this is a valid use case since we want to increment ->usecnt only when
>> > the BO object starts to be used by real HW components and this is
>> > definitely not the case when the BO is created.
>> >
>> > Fix the problem by using refcount_inc_not_zero() instead of
>> > refcount_inc() and fallback to refcount_set(1) when
>> > refcount_inc_not_zero() returns false. Note that this 2-steps operation
>> > is not racy here because the whole section is protected by a mutex
>> > which guarantees that the counter does not change between the
>> > refcount_inc_not_zero() and refcount_set() calls.  
>> 
>> If we're not following the model, and protecting the refcount by a
>> mutex, shouldn't we just be using addition and subtraction instead of
>> refcount's atomics?
>
> Actually, this mutex is protecting the bo->madv value which has to be
> checked when the counter reaches 0 (when decrementing) or 1 (when
> incrementing). We just benefit from this protection here, but ideally
> it would be better to have an refcount_inc_allow_zero() as suggested by
> Daniel.

Let me restate this to see if it makes sense: The refcount is always >=
0, this is is the only path that increases the refcount from 0 to 1, and
it's (incidentally) protected by a mutex, so there's no race between the
attempted increase from nonzero and the set from nonzero to 1.

This seems fine to me as a bugfix.
Boris BREZILLON Nov. 23, 2017, 8:24 a.m. UTC | #4
On Wed, 22 Nov 2017 16:13:31 -0800
Eric Anholt <eric@anholt.net> wrote:

> Boris Brezillon <boris.brezillon@free-electrons.com> writes:
> 
> > On Wed, 22 Nov 2017 13:16:08 -0800
> > Eric Anholt <eric@anholt.net> wrote:
> >  
> >> Boris Brezillon <boris.brezillon@free-electrons.com> writes:
> >>   
> >> > With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's
> >> > passed a refcount object that has its counter set to 0. In this driver,
> >> > this is a valid use case since we want to increment ->usecnt only when
> >> > the BO object starts to be used by real HW components and this is
> >> > definitely not the case when the BO is created.
> >> >
> >> > Fix the problem by using refcount_inc_not_zero() instead of
> >> > refcount_inc() and fallback to refcount_set(1) when
> >> > refcount_inc_not_zero() returns false. Note that this 2-steps operation
> >> > is not racy here because the whole section is protected by a mutex
> >> > which guarantees that the counter does not change between the
> >> > refcount_inc_not_zero() and refcount_set() calls.    
> >> 
> >> If we're not following the model, and protecting the refcount by a
> >> mutex, shouldn't we just be using addition and subtraction instead of
> >> refcount's atomics?  
> >
> > Actually, this mutex is protecting the bo->madv value which has to be
> > checked when the counter reaches 0 (when decrementing) or 1 (when
> > incrementing). We just benefit from this protection here, but ideally
> > it would be better to have an refcount_inc_allow_zero() as suggested by
> > Daniel.  
> 
> Let me restate this to see if it makes sense: The refcount is always >=
> 0, this is is the only path that increases the refcount from 0 to 1, and
> it's (incidentally) protected by a mutex, so there's no race between the
> attempted increase from nonzero and the set from nonzero to 1.

Yep.

> 
> This seems fine to me as a bugfix.

The discussion is going on in the other thread, let's wait a bit
before taking a decision.
Daniel Vetter Nov. 24, 2017, 2:52 p.m. UTC | #5
On Thu, Nov 23, 2017 at 09:24:11AM +0100, Boris Brezillon wrote:
> On Wed, 22 Nov 2017 16:13:31 -0800
> Eric Anholt <eric@anholt.net> wrote:
> 
> > Boris Brezillon <boris.brezillon@free-electrons.com> writes:
> > 
> > > On Wed, 22 Nov 2017 13:16:08 -0800
> > > Eric Anholt <eric@anholt.net> wrote:
> > >  
> > >> Boris Brezillon <boris.brezillon@free-electrons.com> writes:
> > >>   
> > >> > With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's
> > >> > passed a refcount object that has its counter set to 0. In this driver,
> > >> > this is a valid use case since we want to increment ->usecnt only when
> > >> > the BO object starts to be used by real HW components and this is
> > >> > definitely not the case when the BO is created.
> > >> >
> > >> > Fix the problem by using refcount_inc_not_zero() instead of
> > >> > refcount_inc() and fallback to refcount_set(1) when
> > >> > refcount_inc_not_zero() returns false. Note that this 2-steps operation
> > >> > is not racy here because the whole section is protected by a mutex
> > >> > which guarantees that the counter does not change between the
> > >> > refcount_inc_not_zero() and refcount_set() calls.    
> > >> 
> > >> If we're not following the model, and protecting the refcount by a
> > >> mutex, shouldn't we just be using addition and subtraction instead of
> > >> refcount's atomics?  
> > >
> > > Actually, this mutex is protecting the bo->madv value which has to be
> > > checked when the counter reaches 0 (when decrementing) or 1 (when
> > > incrementing). We just benefit from this protection here, but ideally
> > > it would be better to have an refcount_inc_allow_zero() as suggested by
> > > Daniel.  
> > 
> > Let me restate this to see if it makes sense: The refcount is always >=
> > 0, this is is the only path that increases the refcount from 0 to 1, and
> > it's (incidentally) protected by a mutex, so there's no race between the
> > attempted increase from nonzero and the set from nonzero to 1.
> 
> Yep.
> 
> > 
> > This seems fine to me as a bugfix.
> 
> The discussion is going on in the other thread, let's wait a bit
> before taking a decision.

Let's not block the bugfix on reaching perfection imo. I'd merge this as
the minimal fix, and then apply the pretty paint once we have a clear idea
for that.
-Daniel
Stefan Wahren Dec. 7, 2017, 12:31 p.m. UTC | #6
Hi Daniel,


Am 24.11.2017 um 15:52 schrieb Daniel Vetter:
> On Thu, Nov 23, 2017 at 09:24:11AM +0100, Boris Brezillon wrote:
>> On Wed, 22 Nov 2017 16:13:31 -0800
>> Eric Anholt <eric@anholt.net> wrote:
>>
>>> Boris Brezillon <boris.brezillon@free-electrons.com> writes:
>>>
>>>> On Wed, 22 Nov 2017 13:16:08 -0800
>>>> Eric Anholt <eric@anholt.net> wrote:
>>>>   
>>>>> Boris Brezillon <boris.brezillon@free-electrons.com> writes:
>>>>>    
>>>>>> With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's
>>>>>> passed a refcount object that has its counter set to 0. In this driver,
>>>>>> this is a valid use case since we want to increment ->usecnt only when
>>>>>> the BO object starts to be used by real HW components and this is
>>>>>> definitely not the case when the BO is created.
>>>>>>
>>>>>> Fix the problem by using refcount_inc_not_zero() instead of
>>>>>> refcount_inc() and fallback to refcount_set(1) when
>>>>>> refcount_inc_not_zero() returns false. Note that this 2-steps operation
>>>>>> is not racy here because the whole section is protected by a mutex
>>>>>> which guarantees that the counter does not change between the
>>>>>> refcount_inc_not_zero() and refcount_set() calls.
>>>>> If we're not following the model, and protecting the refcount by a
>>>>> mutex, shouldn't we just be using addition and subtraction instead of
>>>>> refcount's atomics?
>>>> Actually, this mutex is protecting the bo->madv value which has to be
>>>> checked when the counter reaches 0 (when decrementing) or 1 (when
>>>> incrementing). We just benefit from this protection here, but ideally
>>>> it would be better to have an refcount_inc_allow_zero() as suggested by
>>>> Daniel.
>>> Let me restate this to see if it makes sense: The refcount is always >=
>>> 0, this is is the only path that increases the refcount from 0 to 1, and
>>> it's (incidentally) protected by a mutex, so there's no race between the
>>> attempted increase from nonzero and the set from nonzero to 1.
>> Yep.
>>
>>> This seems fine to me as a bugfix.
>> The discussion is going on in the other thread, let's wait a bit
>> before taking a decision.
> Let's not block the bugfix on reaching perfection imo. I'd merge this as
> the minimal fix, and then apply the pretty paint once we have a clear idea
> for that.
> -Daniel

sorry but i can't find it in the DRI tree.

Regards
Stefan
Boris BREZILLON Dec. 7, 2017, 12:56 p.m. UTC | #7
On Thu, 7 Dec 2017 13:31:34 +0100
Stefan Wahren <stefan.wahren@i2se.com> wrote:

> Hi Daniel,
> 
> 
> Am 24.11.2017 um 15:52 schrieb Daniel Vetter:
> > On Thu, Nov 23, 2017 at 09:24:11AM +0100, Boris Brezillon wrote:  
> >> On Wed, 22 Nov 2017 16:13:31 -0800
> >> Eric Anholt <eric@anholt.net> wrote:
> >>  
> >>> Boris Brezillon <boris.brezillon@free-electrons.com> writes:
> >>>  
> >>>> On Wed, 22 Nov 2017 13:16:08 -0800
> >>>> Eric Anholt <eric@anholt.net> wrote:
> >>>>     
> >>>>> Boris Brezillon <boris.brezillon@free-electrons.com> writes:
> >>>>>      
> >>>>>> With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's
> >>>>>> passed a refcount object that has its counter set to 0. In this driver,
> >>>>>> this is a valid use case since we want to increment ->usecnt only when
> >>>>>> the BO object starts to be used by real HW components and this is
> >>>>>> definitely not the case when the BO is created.
> >>>>>>
> >>>>>> Fix the problem by using refcount_inc_not_zero() instead of
> >>>>>> refcount_inc() and fallback to refcount_set(1) when
> >>>>>> refcount_inc_not_zero() returns false. Note that this 2-steps operation
> >>>>>> is not racy here because the whole section is protected by a mutex
> >>>>>> which guarantees that the counter does not change between the
> >>>>>> refcount_inc_not_zero() and refcount_set() calls.  
> >>>>> If we're not following the model, and protecting the refcount by a
> >>>>> mutex, shouldn't we just be using addition and subtraction instead of
> >>>>> refcount's atomics?  
> >>>> Actually, this mutex is protecting the bo->madv value which has to be
> >>>> checked when the counter reaches 0 (when decrementing) or 1 (when
> >>>> incrementing). We just benefit from this protection here, but ideally
> >>>> it would be better to have an refcount_inc_allow_zero() as suggested by
> >>>> Daniel.  
> >>> Let me restate this to see if it makes sense: The refcount is always >=
> >>> 0, this is is the only path that increases the refcount from 0 to 1, and
> >>> it's (incidentally) protected by a mutex, so there's no race between the
> >>> attempted increase from nonzero and the set from nonzero to 1.  
> >> Yep.
> >>  
> >>> This seems fine to me as a bugfix.  
> >> The discussion is going on in the other thread, let's wait a bit
> >> before taking a decision.  
> > Let's not block the bugfix on reaching perfection imo. I'd merge this as
> > the minimal fix, and then apply the pretty paint once we have a clear idea
> > for that.
> > -Daniel  
> 
> sorry but i can't find it in the DRI tree.

Sorry for the delay. I applied it this morning [1] and it should appear
in the next -rc.

Regards,

Boris

[1]https://cgit.freedesktop.org/drm/drm-misc/log/?h=drm-misc-fixes
diff mbox

Patch

diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
index 4ae45d7dac42..2decc8e2c79f 100644
--- a/drivers/gpu/drm/vc4/vc4_bo.c
+++ b/drivers/gpu/drm/vc4/vc4_bo.c
@@ -637,7 +637,8 @@  int vc4_bo_inc_usecnt(struct vc4_bo *bo)
 	mutex_lock(&bo->madv_lock);
 	switch (bo->madv) {
 	case VC4_MADV_WILLNEED:
-		refcount_inc(&bo->usecnt);
+		if (!refcount_inc_not_zero(&bo->usecnt))
+			refcount_set(&bo->usecnt, 1);
 		ret = 0;
 		break;
 	case VC4_MADV_DONTNEED: