Message ID | 20181206111141.9840-1-yuval.shaia@oracle.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | IB/rxe: Remove duplicate pool invalidation | expand |
On Thu, 2018-12-06 at 13:11 +0200, Yuval Shaia wrote: > Pool state is set to 'invalid' indirectly by function rxe_pool_put which > is called anyway here so no need to update the state twice. > > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> > --- > drivers/infiniband/sw/rxe/rxe_pool.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c > index 66728086169b..cfe8051c2683 100644 > --- a/drivers/infiniband/sw/rxe/rxe_pool.c > +++ b/drivers/infiniband/sw/rxe/rxe_pool.c > @@ -248,7 +248,6 @@ void rxe_pool_cleanup(struct rxe_pool *pool) > unsigned long flags; > > write_lock_irqsave(&pool->pool_lock, flags); > - pool->state = RXE_POOL_STATE_INVALID; > if (atomic_read(&pool->num_elem) > 0) > pr_warn("%s pool destroyed with unfree'd elem\n", > pool_name(pool)); rxe_pool_put() only causes the pool state to change after the pool reference count has dropped to zero. So I think the pool state change in rxe_pool_cleanup() is helpful to catch use-after-free of a pool. Bart.
On Thu, Dec 06, 2018 at 09:47:01AM -0800, Bart Van Assche wrote: > On Thu, 2018-12-06 at 13:11 +0200, Yuval Shaia wrote: > > Pool state is set to 'invalid' indirectly by function rxe_pool_put which > > is called anyway here so no need to update the state twice. > > > > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> > > --- > > drivers/infiniband/sw/rxe/rxe_pool.c | 1 - > > 1 file changed, 1 deletion(-) > > > > diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c > > index 66728086169b..cfe8051c2683 100644 > > --- a/drivers/infiniband/sw/rxe/rxe_pool.c > > +++ b/drivers/infiniband/sw/rxe/rxe_pool.c > > @@ -248,7 +248,6 @@ void rxe_pool_cleanup(struct rxe_pool *pool) > > unsigned long flags; > > > > write_lock_irqsave(&pool->pool_lock, flags); > > - pool->state = RXE_POOL_STATE_INVALID; > > if (atomic_read(&pool->num_elem) > 0) > > pr_warn("%s pool destroyed with unfree'd elem\n", > > pool_name(pool)); > > rxe_pool_put() only causes the pool state to change after the pool reference > count has dropped to zero. So I think the pool state change in rxe_pool_cleanup() > is helpful to catch use-after-free of a pool. But isn't it the idea with using the ref-count, i.e. the pool *is not* cleaned until all reference are freed? At least from the flow it looks like that pool is valid until last reference is released so why declaring that it is not? If we want pool to be invalid in the first call to cleanup then let's drop this entire ref-counting. I'm probably missing something here so will be glad to understand the logic better. > > Bart.
On 12/8/18 11:24 PM, Yuval Shaia wrote: > On Thu, Dec 06, 2018 at 09:47:01AM -0800, Bart Van Assche wrote: >> On Thu, 2018-12-06 at 13:11 +0200, Yuval Shaia wrote: >>> Pool state is set to 'invalid' indirectly by function rxe_pool_put which >>> is called anyway here so no need to update the state twice. >>> >>> Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> >>> --- >>> drivers/infiniband/sw/rxe/rxe_pool.c | 1 - >>> 1 file changed, 1 deletion(-) >>> >>> diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c >>> index 66728086169b..cfe8051c2683 100644 >>> --- a/drivers/infiniband/sw/rxe/rxe_pool.c >>> +++ b/drivers/infiniband/sw/rxe/rxe_pool.c >>> @@ -248,7 +248,6 @@ void rxe_pool_cleanup(struct rxe_pool *pool) >>> unsigned long flags; >>> >>> write_lock_irqsave(&pool->pool_lock, flags); >>> - pool->state = RXE_POOL_STATE_INVALID; >>> if (atomic_read(&pool->num_elem) > 0) >>> pr_warn("%s pool destroyed with unfree'd elem\n", >>> pool_name(pool)); >> >> rxe_pool_put() only causes the pool state to change after the pool reference >> count has dropped to zero. So I think the pool state change in rxe_pool_cleanup() >> is helpful to catch use-after-free of a pool. > > But isn't it the idea with using the ref-count, i.e. the pool *is not* > cleaned until all reference are freed? > > At least from the flow it looks like that pool is valid until last > reference is released so why declaring that it is not? > > If we want pool to be invalid in the first call to cleanup then let's drop > this entire ref-counting. > > I'm probably missing something here so will be glad to understand the logic > better. My understanding is that once rxe_pool_cleanup() has been called neither rxe_alloc() nor rxe_pool_get_index() nor rxe_pool_get_key() shouldn't be called anymore. I think that's what the "invalid" pool state represents. Bart.
On Sun, Dec 09, 2018 at 09:24:37AM +0200, Yuval Shaia wrote: > On Thu, Dec 06, 2018 at 09:47:01AM -0800, Bart Van Assche wrote: > > On Thu, 2018-12-06 at 13:11 +0200, Yuval Shaia wrote: > > > Pool state is set to 'invalid' indirectly by function rxe_pool_put which > > > is called anyway here so no need to update the state twice. > > > > > > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> > > > drivers/infiniband/sw/rxe/rxe_pool.c | 1 - > > > 1 file changed, 1 deletion(-) > > > > > > diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c > > > index 66728086169b..cfe8051c2683 100644 > > > +++ b/drivers/infiniband/sw/rxe/rxe_pool.c > > > @@ -248,7 +248,6 @@ void rxe_pool_cleanup(struct rxe_pool *pool) > > > unsigned long flags; > > > > > > write_lock_irqsave(&pool->pool_lock, flags); > > > - pool->state = RXE_POOL_STATE_INVALID; > > > if (atomic_read(&pool->num_elem) > 0) > > > pr_warn("%s pool destroyed with unfree'd elem\n", > > > pool_name(pool)); > > > > rxe_pool_put() only causes the pool state to change after the pool reference > > count has dropped to zero. So I think the pool state change in rxe_pool_cleanup() > > is helpful to catch use-after-free of a pool. > > But isn't it the idea with using the ref-count, i.e. the pool *is not* > cleaned until all reference are freed? > > At least from the flow it looks like that pool is valid until last > reference is released so why declaring that it is not? > > If we want pool to be invalid in the first call to cleanup then let's drop > this entire ref-counting. This is what should be done, yes. The driver *must* fully clean up everything related to it during unregister. Otherwise it will have module unload races and other bad bugs. So if a refcount is used it must be bounded and fenced to zero as part of cleanup. Jason
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c index 66728086169b..cfe8051c2683 100644 --- a/drivers/infiniband/sw/rxe/rxe_pool.c +++ b/drivers/infiniband/sw/rxe/rxe_pool.c @@ -248,7 +248,6 @@ void rxe_pool_cleanup(struct rxe_pool *pool) unsigned long flags; write_lock_irqsave(&pool->pool_lock, flags); - pool->state = RXE_POOL_STATE_INVALID; if (atomic_read(&pool->num_elem) > 0) pr_warn("%s pool destroyed with unfree'd elem\n", pool_name(pool));
Pool state is set to 'invalid' indirectly by function rxe_pool_put which is called anyway here so no need to update the state twice. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> --- drivers/infiniband/sw/rxe/rxe_pool.c | 1 - 1 file changed, 1 deletion(-)