diff mbox series

IB/rxe: Remove duplicate pool invalidation

Message ID 20181206111141.9840-1-yuval.shaia@oracle.com (mailing list archive)
State Changes Requested
Headers show
Series IB/rxe: Remove duplicate pool invalidation | expand

Commit Message

Yuval Shaia Dec. 6, 2018, 11:11 a.m. UTC
Pool state is set to 'invalid' indirectly by function rxe_pool_put which
is called anyway here so no need to update the state twice.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
---
 drivers/infiniband/sw/rxe/rxe_pool.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Bart Van Assche Dec. 6, 2018, 5:47 p.m. UTC | #1
On Thu, 2018-12-06 at 13:11 +0200, Yuval Shaia wrote:
> Pool state is set to 'invalid' indirectly by function rxe_pool_put which
> is called anyway here so no need to update the state twice.
> 
> Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_pool.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
> index 66728086169b..cfe8051c2683 100644
> --- a/drivers/infiniband/sw/rxe/rxe_pool.c
> +++ b/drivers/infiniband/sw/rxe/rxe_pool.c
> @@ -248,7 +248,6 @@ void rxe_pool_cleanup(struct rxe_pool *pool)
>  	unsigned long flags;
>  
>  	write_lock_irqsave(&pool->pool_lock, flags);
> -	pool->state = RXE_POOL_STATE_INVALID;
>  	if (atomic_read(&pool->num_elem) > 0)
>  		pr_warn("%s pool destroyed with unfree'd elem\n",
>  			pool_name(pool));

rxe_pool_put() only causes the pool state to change after the pool reference
count has dropped to zero. So I think the pool state change in rxe_pool_cleanup()
is helpful to catch use-after-free of a pool.

Bart.
Yuval Shaia Dec. 9, 2018, 7:24 a.m. UTC | #2
On Thu, Dec 06, 2018 at 09:47:01AM -0800, Bart Van Assche wrote:
> On Thu, 2018-12-06 at 13:11 +0200, Yuval Shaia wrote:
> > Pool state is set to 'invalid' indirectly by function rxe_pool_put which
> > is called anyway here so no need to update the state twice.
> > 
> > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> > ---
> >  drivers/infiniband/sw/rxe/rxe_pool.c | 1 -
> >  1 file changed, 1 deletion(-)
> > 
> > diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
> > index 66728086169b..cfe8051c2683 100644
> > --- a/drivers/infiniband/sw/rxe/rxe_pool.c
> > +++ b/drivers/infiniband/sw/rxe/rxe_pool.c
> > @@ -248,7 +248,6 @@ void rxe_pool_cleanup(struct rxe_pool *pool)
> >  	unsigned long flags;
> >  
> >  	write_lock_irqsave(&pool->pool_lock, flags);
> > -	pool->state = RXE_POOL_STATE_INVALID;
> >  	if (atomic_read(&pool->num_elem) > 0)
> >  		pr_warn("%s pool destroyed with unfree'd elem\n",
> >  			pool_name(pool));
> 
> rxe_pool_put() only causes the pool state to change after the pool reference
> count has dropped to zero. So I think the pool state change in rxe_pool_cleanup()
> is helpful to catch use-after-free of a pool.

But isn't it the idea with using the ref-count, i.e. the pool *is not*
cleaned until all reference are freed?

At least from the flow it looks like that pool is valid until last
reference is released so why declaring that it is not?

If we want pool to be invalid in the first call to cleanup then let's drop
this entire ref-counting.

I'm probably missing something here so will be glad to understand the logic
better.

> 
> Bart.
Bart Van Assche Dec. 9, 2018, 4:25 p.m. UTC | #3
On 12/8/18 11:24 PM, Yuval Shaia wrote:
> On Thu, Dec 06, 2018 at 09:47:01AM -0800, Bart Van Assche wrote:
>> On Thu, 2018-12-06 at 13:11 +0200, Yuval Shaia wrote:
>>> Pool state is set to 'invalid' indirectly by function rxe_pool_put which
>>> is called anyway here so no need to update the state twice.
>>>
>>> Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
>>> ---
>>>   drivers/infiniband/sw/rxe/rxe_pool.c | 1 -
>>>   1 file changed, 1 deletion(-)
>>>
>>> diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
>>> index 66728086169b..cfe8051c2683 100644
>>> --- a/drivers/infiniband/sw/rxe/rxe_pool.c
>>> +++ b/drivers/infiniband/sw/rxe/rxe_pool.c
>>> @@ -248,7 +248,6 @@ void rxe_pool_cleanup(struct rxe_pool *pool)
>>>   	unsigned long flags;
>>>   
>>>   	write_lock_irqsave(&pool->pool_lock, flags);
>>> -	pool->state = RXE_POOL_STATE_INVALID;
>>>   	if (atomic_read(&pool->num_elem) > 0)
>>>   		pr_warn("%s pool destroyed with unfree'd elem\n",
>>>   			pool_name(pool));
>>
>> rxe_pool_put() only causes the pool state to change after the pool reference
>> count has dropped to zero. So I think the pool state change in rxe_pool_cleanup()
>> is helpful to catch use-after-free of a pool.
> 
> But isn't it the idea with using the ref-count, i.e. the pool *is not*
> cleaned until all reference are freed?
> 
> At least from the flow it looks like that pool is valid until last
> reference is released so why declaring that it is not?
> 
> If we want pool to be invalid in the first call to cleanup then let's drop
> this entire ref-counting.
> 
> I'm probably missing something here so will be glad to understand the logic
> better.

My understanding is that once rxe_pool_cleanup() has been called neither 
rxe_alloc() nor rxe_pool_get_index() nor rxe_pool_get_key() shouldn't be 
called anymore. I think that's what the "invalid" pool state represents.

Bart.
Jason Gunthorpe Dec. 9, 2018, 4:43 p.m. UTC | #4
On Sun, Dec 09, 2018 at 09:24:37AM +0200, Yuval Shaia wrote:
> On Thu, Dec 06, 2018 at 09:47:01AM -0800, Bart Van Assche wrote:
> > On Thu, 2018-12-06 at 13:11 +0200, Yuval Shaia wrote:
> > > Pool state is set to 'invalid' indirectly by function rxe_pool_put which
> > > is called anyway here so no need to update the state twice.
> > > 
> > > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> > >  drivers/infiniband/sw/rxe/rxe_pool.c | 1 -
> > >  1 file changed, 1 deletion(-)
> > > 
> > > diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
> > > index 66728086169b..cfe8051c2683 100644
> > > +++ b/drivers/infiniband/sw/rxe/rxe_pool.c
> > > @@ -248,7 +248,6 @@ void rxe_pool_cleanup(struct rxe_pool *pool)
> > >  	unsigned long flags;
> > >  
> > >  	write_lock_irqsave(&pool->pool_lock, flags);
> > > -	pool->state = RXE_POOL_STATE_INVALID;
> > >  	if (atomic_read(&pool->num_elem) > 0)
> > >  		pr_warn("%s pool destroyed with unfree'd elem\n",
> > >  			pool_name(pool));
> > 
> > rxe_pool_put() only causes the pool state to change after the pool reference
> > count has dropped to zero. So I think the pool state change in rxe_pool_cleanup()
> > is helpful to catch use-after-free of a pool.
> 
> But isn't it the idea with using the ref-count, i.e. the pool *is not*
> cleaned until all reference are freed?
> 
> At least from the flow it looks like that pool is valid until last
> reference is released so why declaring that it is not?
> 
> If we want pool to be invalid in the first call to cleanup then let's drop
> this entire ref-counting.

This is what should be done, yes.

The driver *must* fully clean up everything related to it during
unregister. Otherwise it will have module unload races and other bad
bugs.

So if a refcount is used it must be bounded and fenced to zero as part
of cleanup.

Jason
diff mbox series

Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index 66728086169b..cfe8051c2683 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -248,7 +248,6 @@  void rxe_pool_cleanup(struct rxe_pool *pool)
 	unsigned long flags;
 
 	write_lock_irqsave(&pool->pool_lock, flags);
-	pool->state = RXE_POOL_STATE_INVALID;
 	if (atomic_read(&pool->num_elem) > 0)
 		pr_warn("%s pool destroyed with unfree'd elem\n",
 			pool_name(pool));