xen/pvcalls: fix potential endless loop in pvcalls-front.c

Message ID alpine.DEB.2.10.1711101521490.22767@sstabellini-ThinkPad-X260 (mailing list archive)
State New, archived

Commit Message

Stefano Stabellini Nov. 10, 2017, 11:57 p.m. UTC
On Tue, 7 Nov 2017, Juergen Gross wrote:
> On 06/11/17 23:17, Stefano Stabellini wrote:
> > mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
> > take in_mutex on the first try, but you can't take out_mutex. Next time
> > you call mutex_trylock(), in_mutex is going to fail. It's an endless
> > loop.
> > 
> > Solve the problem by moving the two mutex_trylock calls to two separate
> > loops.
> > 
> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
> > CC: boris.ostrovsky@oracle.com
> > CC: jgross@suse.com
> > ---
> >  drivers/xen/pvcalls-front.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> > index 0c1ec68..047dce7 100644
> > --- a/drivers/xen/pvcalls-front.c
> > +++ b/drivers/xen/pvcalls-front.c
> > @@ -1048,8 +1048,9 @@ int pvcalls_front_release(struct socket *sock)
> >  		 * is set to NULL -- we only need to wait for the existing
> >  		 * waiters to return.
> >  		 */
> > -		while (!mutex_trylock(&map->active.in_mutex) ||
> > -			   !mutex_trylock(&map->active.out_mutex))
> > +		while (!mutex_trylock(&map->active.in_mutex))
> > +			cpu_relax();
> > +		while (!mutex_trylock(&map->active.out_mutex))
> >  			cpu_relax();
> 
> Any reason you don't just use mutex_lock()?

Hi Juergen, sorry for the late reply.

Yes, you are right. Given the patch, it would be just the same to use
mutex_lock.

This is when I realized that we actually have a problem: whether
we use mutex_lock or mutex_trylock, there is no guarantee that we'll
be the last to take the in/out_mutex. Other waiters could still be
outstanding.

We solved the same problem using a refcount in pvcalls_front_remove. In
this case, I was thinking of reusing the mutex internal counter for
efficiency, instead of adding one more refcount.

To use the mutex as a refcount, there is really no need to call
mutex_trylock or mutex_lock. I suggest checking the mutex counter
directly:


		while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
		       atomic_long_read(&map->active.out_mutex.owner) != 0UL)
			cpu_relax();
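
(The 0UL test leans on the kernel mutex implementation: ->owner stores the
holding task's pointer plus some flag bits, and reads 0 for a free mutex with
no waiters. As a minimal sketch, the same check behind a named helper; the
helper name is hypothetical, not an existing kernel function:)

	static inline bool pvcalls_mutex_unused(struct mutex *m)
	{
		/* relies on the undocumented ->owner encoding (0 == free);
		 * needs <linux/mutex.h> and <linux/atomic.h> */
		return atomic_long_read(&m->owner) == 0UL;
	}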

Cheers,

Stefano


---

xen/pvcalls: fix potential endless loop in pvcalls-front.c

mutex_trylock() returns 1 if you take the lock and 0 if not. Assume you
take in_mutex on the first try, but you can't take out_mutex. Next time
you call mutex_trylock(), in_mutex is going to fail. It's an endless
loop.

Actually, we don't want to use mutex_trylock at all: we don't need to
take the mutex, we only need to wait until the last mutex waiter/holder
releases it.

Instead of calling mutex_trylock or mutex_lock, just check the mutex
counter directly.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
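
(To see the livelock concretely: suppose the first pass takes in_mutex and
then fails on out_mutex. Every later pass retries mutex_trylock() on
in_mutex, which this caller itself now holds, and kernel mutexes are not
recursive, so the first test fails forever and out_mutex is never
reattempted:)

	/* original loop, annotated: livelocks after a partial success */
	while (!mutex_trylock(&map->active.in_mutex) ||	/* held by us: always fails */
	       !mutex_trylock(&map->active.out_mutex))	/* short-circuited away */
		cpu_relax();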

Comments

Boris Ostrovsky Nov. 11, 2017, 2:02 a.m. UTC | #1
On 11/10/2017 06:57 PM, Stefano Stabellini wrote:
> [...]
>
> To use the mutex as a refcount, there is really no need to call
> mutex_trylock or mutex_lock. I suggest checking the mutex counter
> directly:
> 


I think you want to say

	while(mutex_locked(&map->active.in_mutex.owner) ||
	      mutex_locked(&map->active.out_mutex.owner))
		cpu_relax();

since owner being NULL is not a documented value of a free mutex.

-boris

> 
> 		while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> 		       atomic_long_read(&map->active.out_mutex.owner) != 0UL)
> 			cpu_relax();
> 
> [...]
Juergen Gross Nov. 13, 2017, 3:15 p.m. UTC | #2
On 11/11/17 00:57, Stefano Stabellini wrote:
> [...]
>
> -		while (!mutex_trylock(&map->active.in_mutex) ||
> -			   !mutex_trylock(&map->active.out_mutex))
> +		while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
> +		       atomic_long_read(&map->active.out_mutex.owner) != 0UL)

I don't like this.

Can't you use a kref here? Even if it looks like more overhead, it is
much cleaner. There will be no questions regarding possible races,
while an approach like yours will always smell racy (can't someone
take the mutex just after the above test?).

In no case should you make use of the mutex internals.


Juergen
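
(A rough sketch of the kref pattern being suggested; the "users" and
"unused" members and pvcalls_map_unused() are hypothetical names for
illustration, not part of the actual driver:)

	/* needs <linux/kref.h> and <linux/completion.h>; kref_init() and
	 * init_completion() would run when the map is created, and every
	 * in-flight recvmsg/sendmsg would hold a reference on the map */
	struct sock_mapping {
		struct kref users;
		struct completion unused;
		/* ... existing fields ... */
	};

	static void pvcalls_map_unused(struct kref *kref)
	{
		struct sock_mapping *map =
			container_of(kref, struct sock_mapping, users);

		complete(&map->unused);
	}

	/* in pvcalls_front_release(): drop our reference, then sleep until
	 * the last outstanding user drops theirs */
	kref_put(&map->users, pvcalls_map_unused);
	wait_for_completion(&map->unused);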
Stefano Stabellini Nov. 13, 2017, 6:30 p.m. UTC | #3
On Fri, 10 Nov 2017, Boris Ostrovsky wrote:
> [...]
>
> I think you want to say
> 
> 	while(mutex_locked(&map->active.in_mutex.owner) ||
> 	      mutex_locked(&map->active.out_mutex.owner))
> 		cpu_relax();
> 
> since owner being NULL is not a documented value of a free mutex.
> 

You are right, and the function name is "mutex_is_locked", so it would
be:


	while (mutex_is_locked(&map->active.in_mutex) ||
	       mutex_is_locked(&map->active.out_mutex))
		cpu_relax();


> [...]
Stefano Stabellini Nov. 13, 2017, 6:33 p.m. UTC | #4
On Mon, 13 Nov 2017, Juergen Gross wrote:
> [...]
>
> I don't like this.
>
> Can't you use a kref here? Even if it looks like more overhead, it is
> much cleaner. There will be no questions regarding possible races,
> while an approach like yours will always smell racy (can't someone
> take the mutex just after the above test?).
>
> In no case should you make use of the mutex internals.

Boris' suggestion solves that problem well. Would you be OK with the
proposed

        while (mutex_is_locked(&map->active.in_mutex) ||
               mutex_is_locked(&map->active.out_mutex))
                cpu_relax();

?
Juergen Gross Nov. 14, 2017, 9:11 a.m. UTC | #5
On 13/11/17 19:33, Stefano Stabellini wrote:
> [...]
>
> Boris' suggestion solves that problem well. Would you be OK with the
> proposed
> 
>         while (mutex_is_locked(&map->active.in_mutex) ||
>               mutex_is_locked(&map->active.out_mutex))
>                 cpu_relax();
> 
> ?

I'm not convinced there isn't a race.

In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
then in_mutex is taken. What happens if pvcalls_front_release() resets
sk_send_head and manages to test the mutex before the mutex is locked?

Even if this is impossible: the whole construct seems rather
fragile.


Juergen
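
(The interleaving being described, sketched as a timeline; whether it is
actually reachable depends on ordering guarantees elsewhere in the driver:)

	recvmsg() caller                     pvcalls_front_release()
	----------------                     -----------------------
	reads sk_send_head (still set)
	                                     sets sk_send_head = NULL
	                                     sees both mutexes unlocked
	                                     pvcalls_front_free_map(bedata, map)
	mutex_lock(&map->active.in_mutex)    /* map already freed */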
Boris Ostrovsky Nov. 14, 2017, 9:46 p.m. UTC | #6
On 11/14/2017 04:11 AM, Juergen Gross wrote:
> [...]
>
> I'm not convinced there isn't a race.
>
> In pvcalls_front_recvmsg() sock->sk->sk_send_head is being read and only
> then in_mutex is taken. What happens if pvcalls_front_release() resets
> sk_send_head and manages to test the mutex before the mutex is locked?
>
> Even if this is impossible: the whole construct seems rather
> fragile.


I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
not rely on mutex state.

In any case, we are in the merge window now and I don't want to send a
pull request with a known bug, so we need to have this resolved in the
next couple of days.

-boris
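
(A sketch of what that check could look like; "refcount" stands in for
whichever per-map counter is used, so the field name is illustrative rather
than the driver's actual one:)

	/* wait until we are the map's only remaining user, then free it */
	while (atomic_read(&map->refcount) > 1)
		cpu_relax();

	pvcalls_front_free_map(bedata, map);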
Juergen Gross Nov. 15, 2017, 8:14 a.m. UTC | #7
On 14/11/17 22:46, Boris Ostrovsky wrote:
> [...]
> 
> I think we can wait until pvcalls_refcount is 1 (i.e. it's only us) and
> not rely on mutex state.

Yes, this would work.


Juergen

Patch

diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index 0c1ec68..9f33cb8 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -1048,8 +1048,8 @@ int pvcalls_front_release(struct socket *sock)
 		 * is set to NULL -- we only need to wait for the existing
 		 * waiters to return.
 		 */
-		while (!mutex_trylock(&map->active.in_mutex) ||
-			   !mutex_trylock(&map->active.out_mutex))
+		while (atomic_long_read(&map->active.in_mutex.owner) != 0UL ||
+		       atomic_long_read(&map->active.out_mutex.owner) != 0UL)
 			cpu_relax();
 
 		pvcalls_front_free_map(bedata, map);