diff mbox

[for-rc,4/6] RDMA/vmw_pvrdma: Use refcount_dec_and_test to avoid warning

Message ID 20171208190218.GA744@bryantan-devbox.prom.eng.vmware.com.prom.eng.vmware.com (mailing list archive)
State Superseded
Headers show

Commit Message

Bryan Tan Dec. 8, 2017, 7:02 p.m. UTC
refcount_dec generates a warning when the operation
causes the refcount to hit zero. Avoid this by using
refcount_dec_and_test.

Fixes: 8b10ba783c9d ("RDMA/vmw_pvrdma: Add shared receive queue support")
Reviewed-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Bryan Tan <bryantan@vmware.com>
---
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Jason Gunthorpe Dec. 8, 2017, 11:30 p.m. UTC | #1
On Fri, Dec 08, 2017 at 11:02:24AM -0800, Bryan Tan wrote:
> refcount_dec generates a warning when the operation
> causes the refcount to hit zero. Avoid this by using
> refcount_dec_and_test.
> 
> Fixes: 8b10ba783c9d ("RDMA/vmw_pvrdma: Add shared receive queue support")
> Reviewed-by: Adit Ranadive <aditr@vmware.com>
> Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
> Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
> Signed-off-by: Bryan Tan <bryantan@vmware.com>
>  drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
> index 826ccb8..a2b1a3c 100644
> +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
> @@ -236,8 +236,8 @@ static void pvrdma_free_srq(struct pvrdma_dev *dev, struct pvrdma_srq *srq)
>  	dev->srq_tbl[srq->srq_handle] = NULL;
>  	spin_unlock_irqrestore(&dev->srq_tbl_lock, flags);
>  
> -	refcount_dec(&srq->refcnt);
> -	wait_event(srq->wait, !refcount_read(&srq->refcnt));
> +	if (!refcount_dec_and_test(&srq->refcnt))
> +		wait_event(srq->wait, !refcount_read(&srq->refcnt));

I really don't like this idiom for using refcount, refcount should
not be dec'd below 0 even under a dec_and_test..

Why not just simplify:

	wait_event(srq->wait, refcount_read(&srq->refcnt) == 1);

???

Same comment on the other patch switching from atomic to refcount

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bryan Tan Dec. 11, 2017, 6:33 p.m. UTC | #2
On Fri, Dec 08, 2017 at 04:30:49PM -0700, Jason Gunthorpe wrote:
> On Fri, Dec 08, 2017 at 11:02:24AM -0800, Bryan Tan wrote:
> > refcount_dec generates a warning when the operation
> > causes the refcount to hit zero. Avoid this by using
> > refcount_dec_and_test.
> > 
> > Fixes: 8b10ba783c9d ("RDMA/vmw_pvrdma: Add shared receive queue support")
> > Reviewed-by: Adit Ranadive <aditr@vmware.com>
> > Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
> > Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
> > Signed-off-by: Bryan Tan <bryantan@vmware.com>
> >  drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
> > index 826ccb8..a2b1a3c 100644
> > +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
> > @@ -236,8 +236,8 @@ static void pvrdma_free_srq(struct pvrdma_dev *dev, struct pvrdma_srq *srq)
> >  	dev->srq_tbl[srq->srq_handle] = NULL;
> >  	spin_unlock_irqrestore(&dev->srq_tbl_lock, flags);
> >  
> > -	refcount_dec(&srq->refcnt);
> > -	wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > +	if (!refcount_dec_and_test(&srq->refcnt))
> > +		wait_event(srq->wait, !refcount_read(&srq->refcnt));
> 
> I really don't like this idiom for using refcount, refcount should
> not be dec'd below 0 even under a dec_and_test..
> Why not just simplify:
> 
> 	wait_event(srq->wait, refcount_read(&srq->refcnt) == 1);
> 
> ???
> 
> Same comment on the other patch switching from atomic to refcount
> 
> Jason

The refcount doesn't go below 0 in either case--the problem is that
I didn't realise refcount_dec complains about decrements resulting
in a value of 0 [1]. There are no problems with refcount_dec_and_test
resulting in a value of 0.

About checking for refcnt == 1, I do not know of a safe way to only
wake up when the refcount hits 1. Right now we do that by checking
for 0 and doing a wake_up if the result of refcount_dec_and_test is
0 (see the SRQ event handler in pvrdma_main.c). If there's a way to
accomplish this without another dec_and_test in the destroy path, I
can do so.

Bryan

[1] https://patchwork.kernel.org/patch/9863573/ 3rd paragraph
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Dec. 11, 2017, 8:14 p.m. UTC | #3
On Mon, Dec 11, 2017 at 10:33:15AM -0800, Bryan Tan wrote:
> On Fri, Dec 08, 2017 at 04:30:49PM -0700, Jason Gunthorpe wrote:
> > On Fri, Dec 08, 2017 at 11:02:24AM -0800, Bryan Tan wrote:
> > > refcount_dec generates a warning when the operation
> > > causes the refcount to hit zero. Avoid this by using
> > > refcount_dec_and_test.
> > > 
> > > Fixes: 8b10ba783c9d ("RDMA/vmw_pvrdma: Add shared receive queue support")
> > > Reviewed-by: Adit Ranadive <aditr@vmware.com>
> > > Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
> > > Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
> > > Signed-off-by: Bryan Tan <bryantan@vmware.com>
> > >  drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
> > > index 826ccb8..a2b1a3c 100644
> > > +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
> > > @@ -236,8 +236,8 @@ static void pvrdma_free_srq(struct pvrdma_dev *dev, struct pvrdma_srq *srq)
> > >  	dev->srq_tbl[srq->srq_handle] = NULL;
> > >  	spin_unlock_irqrestore(&dev->srq_tbl_lock, flags);
> > >  
> > > -	refcount_dec(&srq->refcnt);
> > > -	wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > > +	if (!refcount_dec_and_test(&srq->refcnt))
> > > +		wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > 
> > I really don't like this idiom for using refcount, refcount should
> > not be dec'd below 0 even under a dec_and_test..
> > Why not just simplify:
> >
> > 	wait_event(srq->wait, refcount_read(&srq->refcnt) == 1);
> >
> > ???
> >
> > Same comment on the other patch switching from atomic to refcount
> > 
> > Jason
> 
> The refcount doesn't go below 0 in either case--the problem is that
> I didn't realise refcount_dec complains about decrements resulting
> in a value of 0 [1]. There are no problems with refcount_dec_and_test
> resulting in a value of 0.

Thats what I ment with my remark.. Sorry for the confusion

> About checking for refcnt == 1, I do not know of a safe way to only
> wake up when the refcount hits 1. Right now we do that by checking
> for 0 and doing a wake_up if the result of refcount_dec_and_test is
> 0 (see the SRQ event handler in pvrdma_main.c). If there's a way to
> accomplish this without another dec_and_test in the destroy path, I
> can do so.

What is wrong with this:

 	wait_event(srq->wait, refcount_read(&srq->refcnt) == 1);
?

refcount == 1 means the caller is the last owner of the refcount,
presumably you have already taken care to ensure it cannot be inc'd
again (or the code is already not locked right)

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bryan Tan Dec. 11, 2017, 9:59 p.m. UTC | #4
On Mon, Dec 11, 2017 at 01:14:21PM -0700, Jason Gunthorpe wrote:
> On Mon, Dec 11, 2017 at 10:33:15AM -0800, Bryan Tan wrote:
> > On Fri, Dec 08, 2017 at 04:30:49PM -0700, Jason Gunthorpe wrote:
> > > On Fri, Dec 08, 2017 at 11:02:24AM -0800, Bryan Tan wrote:
> > > > refcount_dec generates a warning when the operation
> > > > causes the refcount to hit zero. Avoid this by using
> > > > refcount_dec_and_test.
> > > > 
> > > > Fixes: 8b10ba783c9d ("RDMA/vmw_pvrdma: Add shared receive queue support")
> > > > Reviewed-by: Adit Ranadive <aditr@vmware.com>
> > > > Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
> > > > Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
> > > > Signed-off-by: Bryan Tan <bryantan@vmware.com>
> > > >  drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
> > > > index 826ccb8..a2b1a3c 100644
> > > > +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
> > > > @@ -236,8 +236,8 @@ static void pvrdma_free_srq(struct pvrdma_dev *dev, struct pvrdma_srq *srq)
> > > >  	dev->srq_tbl[srq->srq_handle] = NULL;
> > > >  	spin_unlock_irqrestore(&dev->srq_tbl_lock, flags);
> > > >  
> > > > -	refcount_dec(&srq->refcnt);
> > > > -	wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > > > +	if (!refcount_dec_and_test(&srq->refcnt))
> > > > +		wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > > 
> > > I really don't like this idiom for using refcount, refcount should
> > > not be dec'd below 0 even under a dec_and_test..
> > > Why not just simplify:
> > >
> > > 	wait_event(srq->wait, refcount_read(&srq->refcnt) == 1);
> > >
> > > ???
> > >
> > > Same comment on the other patch switching from atomic to refcount
> > > 
> > > Jason
> > 
> > The refcount doesn't go below 0 in either case--the problem is that
> > I didn't realise refcount_dec complains about decrements resulting
> > in a value of 0 [1]. There are no problems with refcount_dec_and_test
> > resulting in a value of 0.
> 
> Thats what I ment with my remark.. Sorry for the confusion
> 
> > About checking for refcnt == 1, I do not know of a safe way to only
> > wake up when the refcount hits 1. Right now we do that by checking
> > for 0 and doing a wake_up if the result of refcount_dec_and_test is
> > 0 (see the SRQ event handler in pvrdma_main.c). If there's a way to
> > accomplish this without another dec_and_test in the destroy path, I
> > can do so.
> 
> What is wrong with this:
> 
>  	wait_event(srq->wait, refcount_read(&srq->refcnt) == 1);
> ?
> 
> refcount == 1 means the caller is the last owner of the refcount,
> presumably you have already taken care to ensure it cannot be inc'd
> again (or the code is already not locked right)

In the SRQ event handler, we do this at the end of handling the event:

> if (refcount_dec_and_test(&srq->refcnt))
>         wake_up(&srq->wait);

If we wait on refcnt == 1, there isn't an atomic way to decrement and
check if the result is 1. If it's fine to wake up the process while
the condition hasn't been met, this would work, but that doesn't seem
to be the right solution here, unless I'm missing something.

Bryan
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Dec. 11, 2017, 10:14 p.m. UTC | #5
On Mon, Dec 11, 2017 at 01:59:37PM -0800, Bryan Tan wrote:
> On Mon, Dec 11, 2017 at 01:14:21PM -0700, Jason Gunthorpe wrote:
> > On Mon, Dec 11, 2017 at 10:33:15AM -0800, Bryan Tan wrote:
> > > On Fri, Dec 08, 2017 at 04:30:49PM -0700, Jason Gunthorpe wrote:
> > > > On Fri, Dec 08, 2017 at 11:02:24AM -0800, Bryan Tan wrote:
> > > > > refcount_dec generates a warning when the operation
> > > > > causes the refcount to hit zero. Avoid this by using
> > > > > refcount_dec_and_test.
> > > > > 
> > > > > Fixes: 8b10ba783c9d ("RDMA/vmw_pvrdma: Add shared receive queue support")
> > > > > Reviewed-by: Adit Ranadive <aditr@vmware.com>
> > > > > Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
> > > > > Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
> > > > > Signed-off-by: Bryan Tan <bryantan@vmware.com>
> > > > >  drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
> > > > > index 826ccb8..a2b1a3c 100644
> > > > > +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
> > > > > @@ -236,8 +236,8 @@ static void pvrdma_free_srq(struct pvrdma_dev *dev, struct pvrdma_srq *srq)
> > > > >  	dev->srq_tbl[srq->srq_handle] = NULL;
> > > > >  	spin_unlock_irqrestore(&dev->srq_tbl_lock, flags);
> > > > >  
> > > > > -	refcount_dec(&srq->refcnt);
> > > > > -	wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > > > > +	if (!refcount_dec_and_test(&srq->refcnt))
> > > > > +		wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > > > 
> > > > I really don't like this idiom for using refcount, refcount should
> > > > not be dec'd below 0 even under a dec_and_test..
> > > > Why not just simplify:
> > > >
> > > > 	wait_event(srq->wait, refcount_read(&srq->refcnt) == 1);
> > > >
> > > > ???
> > > >
> > > > Same comment on the other patch switching from atomic to refcount
> > > > 
> > > > Jason
> > > 
> > > The refcount doesn't go below 0 in either case--the problem is that
> > > I didn't realise refcount_dec complains about decrements resulting
> > > in a value of 0 [1]. There are no problems with refcount_dec_and_test
> > > resulting in a value of 0.
> > 
> > Thats what I ment with my remark.. Sorry for the confusion
> > 
> > > About checking for refcnt == 1, I do not know of a safe way to only
> > > wake up when the refcount hits 1. Right now we do that by checking
> > > for 0 and doing a wake_up if the result of refcount_dec_and_test is
> > > 0 (see the SRQ event handler in pvrdma_main.c). If there's a way to
> > > accomplish this without another dec_and_test in the destroy path, I
> > > can do so.
> > 
> > What is wrong with this:
> > 
> >  	wait_event(srq->wait, refcount_read(&srq->refcnt) == 1);
> > ?
> > 
> > refcount == 1 means the caller is the last owner of the refcount,
> > presumably you have already taken care to ensure it cannot be inc'd
> > again (or the code is already not locked right)
> 
> In the SRQ event handler, we do this at the end of handling the event:
> 
> > if (refcount_dec_and_test(&srq->refcnt))
> >         wake_up(&srq->wait);

What? You can't combine one thread doing

 if (refcount_dec_and_test(&srq->refcnt))
         wake_up(&srq->wait);

With
  if (!refcount_dec_and_test(&srq->refcnt))
	wait_event(srq->wait, !refcount_read(&srq->refcnt));

It makes no sense, how is that a refcount???

They can't *BOTH* refcount_dec_and_test!

I can understand doing

  if (refcount_dec_and_test(&srq->refcnt))
         wake_up(&srq->wait);

and then

   wait_event(srq->wait, !refcount_read(&srq->refcnt));

That makes perfect sense.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bryan Tan Dec. 12, 2017, 5:13 p.m. UTC | #6
On Mon, Dec 11, 2017 at 03:14:25PM -0700, Jason Gunthorpe wrote:
> On Mon, Dec 11, 2017 at 01:59:37PM -0800, Bryan Tan wrote:
> > On Mon, Dec 11, 2017 at 01:14:21PM -0700, Jason Gunthorpe wrote:
> > > On Mon, Dec 11, 2017 at 10:33:15AM -0800, Bryan Tan wrote:
> > > > On Fri, Dec 08, 2017 at 04:30:49PM -0700, Jason Gunthorpe wrote:
> > > > > On Fri, Dec 08, 2017 at 11:02:24AM -0800, Bryan Tan wrote:
> > > > > > refcount_dec generates a warning when the operation
> > > > > > causes the refcount to hit zero. Avoid this by using
> > > > > > refcount_dec_and_test.
> > > > > > 
> > > > > > Fixes: 8b10ba783c9d ("RDMA/vmw_pvrdma: Add shared receive queue support")
> > > > > > Reviewed-by: Adit Ranadive <aditr@vmware.com>
> > > > > > Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
> > > > > > Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
> > > > > > Signed-off-by: Bryan Tan <bryantan@vmware.com>
> > > > > >  drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c | 4 ++--
> > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
> > > > > > index 826ccb8..a2b1a3c 100644
> > > > > > +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
> > > > > > @@ -236,8 +236,8 @@ static void pvrdma_free_srq(struct pvrdma_dev *dev, struct pvrdma_srq *srq)
> > > > > >  	dev->srq_tbl[srq->srq_handle] = NULL;
> > > > > >  	spin_unlock_irqrestore(&dev->srq_tbl_lock, flags);
> > > > > >  
> > > > > > -	refcount_dec(&srq->refcnt);
> > > > > > -	wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > > > > > +	if (!refcount_dec_and_test(&srq->refcnt))
> > > > > > +		wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > > > > 
> > > > > I really don't like this idiom for using refcount, refcount should
> > > > > not be dec'd below 0 even under a dec_and_test..
> > > > > Why not just simplify:
> > > > >
> > > > > 	wait_event(srq->wait, refcount_read(&srq->refcnt) == 1);
> > > > >
> > > > > ???
> > > > >
> > > > > Same comment on the other patch switching from atomic to refcount
> > > > > 
> > > > > Jason
> > > > 
> > > > The refcount doesn't go below 0 in either case--the problem is that
> > > > I didn't realise refcount_dec complains about decrements resulting
> > > > in a value of 0 [1]. There are no problems with refcount_dec_and_test
> > > > resulting in a value of 0.
> > > 
> > > Thats what I ment with my remark.. Sorry for the confusion
> > > 
> > > > About checking for refcnt == 1, I do not know of a safe way to only
> > > > wake up when the refcount hits 1. Right now we do that by checking
> > > > for 0 and doing a wake_up if the result of refcount_dec_and_test is
> > > > 0 (see the SRQ event handler in pvrdma_main.c). If there's a way to
> > > > accomplish this without another dec_and_test in the destroy path, I
> > > > can do so.
> > > 
> > > What is wrong with this:
> > > 
> > >  	wait_event(srq->wait, refcount_read(&srq->refcnt) == 1);
> > > ?
> > > 
> > > refcount == 1 means the caller is the last owner of the refcount,
> > > presumably you have already taken care to ensure it cannot be inc'd
> > > again (or the code is already not locked right)
> > 
> > In the SRQ event handler, we do this at the end of handling the event:
> > 
> > > if (refcount_dec_and_test(&srq->refcnt))
> > >         wake_up(&srq->wait);
> 
> What? You can't combine one thread doing
> 
>  if (refcount_dec_and_test(&srq->refcnt))
>          wake_up(&srq->wait);
> 
> With
>   if (!refcount_dec_and_test(&srq->refcnt))
> 	wait_event(srq->wait, !refcount_read(&srq->refcnt));
> 
> It makes no sense, how is that a refcount???
> 
> They can't *BOTH* refcount_dec_and_test!
>
> I can understand doing
> 
>   if (refcount_dec_and_test(&srq->refcnt))
>          wake_up(&srq->wait);
> 
> and then
> 
>    wait_event(srq->wait, !refcount_read(&srq->refcnt));
> 
> That makes perfect sense.

Perhaps I am mistaken in my understanding of refcounts here, but
what is wrong with this? If it's not that the refcount is set to
1 on resource creation (which, by your earlier suggestion of
checking refcnt == 1, seems to be fine), and both callers (the
resource event handler and resource destroy call) need to make
sure the destroy sequence only happens once (when the refcount
reaches 0), then using refcount_dec_and_test seems right to me.

Your last suggestion would mean setting the refcount to 0 on
resource creation, as we wait on refcount hitting 0, but that
would mean the resource event handler would often be calling
wake_up unnecessarily.

What we're doing is effectively the same as how mlx5 handles
this (see drivers/net/ethernet/mellanox/mlx5/core/srq.c) I
suppose using refcount_t requires one to follow a particular
model of usage but I'm not sure how this isn't right.

Thanks,
Bryan
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Dec. 12, 2017, 6:55 p.m. UTC | #7
> > What? You can't combine one thread doing
> > 
> >  if (refcount_dec_and_test(&srq->refcnt))
> >          wake_up(&srq->wait);
> > 
> > With
> >   if (!refcount_dec_and_test(&srq->refcnt))
> > 	wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > 
> > It makes no sense, how is that a refcount???
> > 
> > They can't *BOTH* refcount_dec_and_test!
> >
> > I can understand doing
> >
> >   if (refcount_dec_and_test(&srq->refcnt))
> >          wake_up(&srq->wait);
> >
> > and then
> > 
> >    wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > 
> > That makes perfect sense.
> 
> Perhaps I am mistaken in my understanding of refcounts here, but
> what is wrong with this? If it's not that the refcount is set to
> 1 on resource creation (which, by your earlier suggestion of
> checking refcnt == 1, seems to be fine), and both callers (the
> resource event handler and resource destroy call) need to make
> sure the destroy sequence only happens once (when the refcount
> reaches 0), then using refcount_dec_and_test seems right to me.

I think the issue here is trying to fit an optimized approach that was
using atomics into a refcount.

I have certain expectations when I see something called 'refcount'
that this scheme really doesn't follow anymore, and those expectations
are pretty much matched by the protections inside refcount against
going to 0 and so on.

The algorithm works OK, but I'm not sure it is a 'refcount', more of a
'usecnt'..

On the other hand, having the refcount overflow protections in this
path does seem useful.

I did a random audit of refcount_dec_and_test existing users and
didn't find any that use this pattern.

What you've actually built here is a rwsem out of an atomic and a wait
queue..

So, what is wrong with a rwsem?

> What we're doing is effectively the same as how mlx5 handles
> this (see drivers/net/ethernet/mellanox/mlx5/core/srq.c) I
> suppose using refcount_t requires one to follow a particular
> model of usage but I'm not sure how this isn't right.

I didn't say it was wrong, I asked 'how is that a refcount??'

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bryan Tan Dec. 13, 2017, 12:08 a.m. UTC | #8
On Tue, Dec 12, 2017 at 11:55:30AM -0700, Jason Gunthorpe wrote:
> > > What? You can't combine one thread doing
> > > 
> > >  if (refcount_dec_and_test(&srq->refcnt))
> > >          wake_up(&srq->wait);
> > > 
> > > With
> > >   if (!refcount_dec_and_test(&srq->refcnt))
> > > 	wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > > 
> > > It makes no sense, how is that a refcount???
> > > 
> > > They can't *BOTH* refcount_dec_and_test!
> > >
> > > I can understand doing
> > >
> > >   if (refcount_dec_and_test(&srq->refcnt))
> > >          wake_up(&srq->wait);
> > >
> > > and then
> > > 
> > >    wait_event(srq->wait, !refcount_read(&srq->refcnt));
> > > 
> > > That makes perfect sense.
> > 
> > Perhaps I am mistaken in my understanding of refcounts here, but
> > what is wrong with this? If it's not that the refcount is set to
> > 1 on resource creation (which, by your earlier suggestion of
> > checking refcnt == 1, seems to be fine), and both callers (the
> > resource event handler and resource destroy call) need to make
> > sure the destroy sequence only happens once (when the refcount
> > reaches 0), then using refcount_dec_and_test seems right to me.
> 
> I think the issue here is trying to fit an optimized approach that was
> using atomics into a refcount.
> 
> I have certain expectations when I see something called 'refcount'
> that this scheme really doesn't follow anymore, and those expectations
> are pretty much matched by the protections inside refcount against
> going to 0 and so on.
> 
> The algorithm works OK, but I'm not sure it is a 'refcount', more of a
> 'usecnt'..

Hm, it's still not clear to me what expectations are not followed here,
and how you're differentiating 'refcount' from 'usecnt'. And not that
it makes it right, but I did notice that mlx4 uses the same pattern as
what I've done here [1].

If it makes the most sense to revert to atomic operations or use a
rwsem, I can do that. Just trying to understand what the expectations
are for using refcount_t.

Bryan

[1] Commit ff61b5e3f041c2f1aa8d7c700af3007889973889
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Dec. 13, 2017, 2:20 a.m. UTC | #9
On Tue, Dec 12, 2017 at 04:08:16PM -0800, Bryan Tan wrote:
> On Tue, Dec 12, 2017 at 11:55:30AM -0700, Jason Gunthorpe wrote:

> Hm, it's still not clear to me what expectations are not followed here,
> and how you're differentiating 'refcount' from 'usecnt'.

A refcount is something that has exactly one 'refcount_dec_and_test'
that then goes on to 'free' the thing being refcountedt. It never has
a wait queue.

A usecnt is something that has an async path that waits for all users
to stop using it then 'frees' it. It always includes a wait
queue.

> And not that it makes it right, but I did notice that mlx4 uses the
> same pattern as what I've done here [1].

Ah, this is good, I was trying to find something like it. Kees
approves so I have no problem taking your patch.

Though you might want to reflect on if using a completion is better
than a wait_event.. Not immediately clear to me.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bryan Tan Dec. 13, 2017, 9:07 a.m. UTC | #10
On Tue, Dec 12, 2017 at 07:20:31PM -0700, Jason Gunthorpe wrote:
> On Tue, Dec 12, 2017 at 04:08:16PM -0800, Bryan Tan wrote:
> > On Tue, Dec 12, 2017 at 11:55:30AM -0700, Jason Gunthorpe wrote:
> 
> > Hm, it's still not clear to me what expectations are not followed here,
> > and how you're differentiating 'refcount' from 'usecnt'.
> 
> A refcount is something that has exactly one 'refcount_dec_and_test'
> that then goes on to 'free' the thing being refcountedt. It never has
> a wait queue.
> 
> A usecnt is something that has an async path that waits for all users
> to stop using it then 'frees' it. It always includes a wait
> queue.

I see, thanks for the explanation.

> > And not that it makes it right, but I did notice that mlx4 uses the
> > same pattern as what I've done here [1].
> 
> Ah, this is good, I was trying to find something like it. Kees
> approves so I have no problem taking your patch.
> 
> Though you might want to reflect on if using a completion is better
> than a wait_event.. Not immediately clear to me.

Oof, thanks for pointing that out. There's a problem here it seems:
0. refcount for a resource is 2 after entering an event handler
1. refcount_dec_and_test gets called by the resource destroy
2. refcount_dec_and_test gets called by the resource event handler
3. resource destroy call checks the condition in wait_event and
   proceeds to free the resource
4. resource event handler calls wake_up() on free'd resource

I'll use a completion here instead.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe Dec. 13, 2017, 4:12 p.m. UTC | #11
On Wed, Dec 13, 2017 at 01:07:02AM -0800, Bryan Tan wrote:

> > > And not that it makes it right, but I did notice that mlx4 uses the
> > > same pattern as what I've done here [1].
> > 
> > Ah, this is good, I was trying to find something like it. Kees
> > approves so I have no problem taking your patch.
> > 
> > Though you might want to reflect on if using a completion is better
> > than a wait_event.. Not immediately clear to me.
> 
> Oof, thanks for pointing that out. There's a problem here it seems:
> 0. refcount for a resource is 2 after entering an event handler
> 1. refcount_dec_and_test gets called by the resource destroy
> 2. refcount_dec_and_test gets called by the resource event handler
> 3. resource destroy call checks the condition in wait_event and
>    proceeds to free the resource
> 4. resource event handler calls wake_up() on free'd resource

Yes I agree. I knew it looked funny :)

Please resend

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
index 826ccb8..a2b1a3c 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
@@ -236,8 +236,8 @@  static void pvrdma_free_srq(struct pvrdma_dev *dev, struct pvrdma_srq *srq)
 	dev->srq_tbl[srq->srq_handle] = NULL;
 	spin_unlock_irqrestore(&dev->srq_tbl_lock, flags);
 
-	refcount_dec(&srq->refcnt);
-	wait_event(srq->wait, !refcount_read(&srq->refcnt));
+	if (!refcount_dec_and_test(&srq->refcnt))
+		wait_event(srq->wait, !refcount_read(&srq->refcnt));
 
 	/* There is no support for kernel clients, so this is safe. */
 	ib_umem_release(srq->umem);