diff mbox

[v1,for-rc] RDMA/vmw_pvrdma: Report CQ missed events

Message ID 1502391902-5674-1-git-send-email-aditr@vmware.com (mailing list archive)
State Accepted
Headers show

Commit Message

Adit Ranadive Aug. 10, 2017, 7:05 p.m. UTC
From: Bryan Tan <bryantan@vmware.com>

There is a chance of a race between arming the CQ and receiving
completions. By reporting CQ missed events any ULPs should poll
again to get the completions.

Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver")
Acked-by: Aditya Sarwade <asarwade@vmware.com>
Signed-off-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
---
v0 -> v1:
 - Check for invalid ring index.
---
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

Comments

Yuval Shaia Aug. 10, 2017, 8:02 p.m. UTC | #1
On Thu, Aug 10, 2017 at 12:05:02PM -0700, Adit Ranadive wrote:
> From: Bryan Tan <bryantan@vmware.com>
> 
> There is a chance of a race between arming the CQ and receiving
> completions. By reporting CQ missed events any ULPs should poll
> again to get the completions.
> 
> Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver")
> Acked-by: Aditya Sarwade <asarwade@vmware.com>
> Signed-off-by: Bryan Tan <bryantan@vmware.com>
> Signed-off-by: Adit Ranadive <aditr@vmware.com>
> ---
> v0 -> v1:
>  - Check for invalid ring index.
> ---
>  drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
> index 69bda61..90aa326 100644
> --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
> +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
> @@ -65,13 +65,28 @@ int pvrdma_req_notify_cq(struct ib_cq *ibcq,
>  	struct pvrdma_dev *dev = to_vdev(ibcq->device);
>  	struct pvrdma_cq *cq = to_vcq(ibcq);
>  	u32 val = cq->cq_handle;
> +	unsigned long flags;
> +	int has_data = 0;
>  
>  	val |= (notify_flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED ?
>  		PVRDMA_UAR_CQ_ARM_SOL : PVRDMA_UAR_CQ_ARM;
>  
> +	spin_lock_irqsave(&cq->cq_lock, flags);
> +
>  	pvrdma_write_uar_cq(dev, val);
>  
> -	return 0;
> +	if (notify_flags & IB_CQ_REPORT_MISSED_EVENTS) {
> +		unsigned int head;
> +
> +		has_data = pvrdma_idx_ring_has_data(&cq->ring_state->rx,
> +						    cq->ibcq.cqe, &head);
> +		if (unlikely(has_data == PVRDMA_INVALID_IDX))
> +			dev_err(&dev->pdev->dev, "CQ ring state invalid\n");

I see the point of checking the return value but per my understanding, and
correct me if i'm wrong, this rare case points to a corrupted ring which
can happen *only* in case of a bug so it is not "error" by nature.
If this is correct then i don't see the point of having this "question" on
every call to ib_notify_cq.

Do you agree to move this check to pvrdma_idx_ring_has_data and even make
the function use BUG_ON?

> +	}
> +
> +	spin_unlock_irqrestore(&cq->cq_lock, flags);
> +
> +	return has_data;
>  }
>  
>  /**
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Adit Ranadive Aug. 10, 2017, 8:55 p.m. UTC | #2
T24gOC8xMC8xNywgMTowMiBQTSwgIll1dmFsIFNoYWlhIiA8eXV2YWwuc2hhaWFAb3JhY2xlLmNv
bT4gd3JvdGU6DQo+IE9uIFRodSwgQXVnIDEwLCAyMDE3IGF0IDEyOjA1OjAyUE0gLTA3MDAsIEFk
aXQgUmFuYWRpdmUgd3JvdGU6DQo+ID4gRnJvbTogQnJ5YW4gVGFuIDxicnlhbnRhbkB2bXdhcmUu
Y29tPg0KPiA+IA0KPiA+IFRoZXJlIGlzIGEgY2hhbmNlIG9mIGEgcmFjZSBiZXR3ZWVuIGFybWlu
ZyB0aGUgQ1EgYW5kIHJlY2VpdmluZw0KPiA+IGNvbXBsZXRpb25zLiBCeSByZXBvcnRpbmcgQ1Eg
bWlzc2VkIGV2ZW50cyBhbnkgVUxQcyBzaG91bGQgcG9sbA0KPiA+IGFnYWluIHRvIGdldCB0aGUg
Y29tcGxldGlvbnMuDQo+ID4gDQo+ID4gRml4ZXM6IDI5YzhkOWViYTU1MCAoIklCOiBBZGQgdm13
X3B2cmRtYSBkcml2ZXIiKQ0KPiA+IEFja2VkLWJ5OiBBZGl0eWEgU2Fyd2FkZSA8YXNhcndhZGVA
dm13YXJlLmNvbT4NCj4gPiBTaWduZWQtb2ZmLWJ5OiBCcnlhbiBUYW4gPGJyeWFudGFuQHZtd2Fy
ZS5jb20+DQo+ID4gU2lnbmVkLW9mZi1ieTogQWRpdCBSYW5hZGl2ZSA8YWRpdHJAdm13YXJlLmNv
bT4NCj4gPiAtLS0NCj4gPiB2MCAtPiB2MToNCj4gPiAgLSBDaGVjayBmb3IgaW52YWxpZCByaW5n
IGluZGV4Lg0KPiA+IC0tLQ0KPiA+ICBkcml2ZXJzL2luZmluaWJhbmQvaHcvdm13X3B2cmRtYS9w
dnJkbWFfY3EuYyB8IDE3ICsrKysrKysrKysrKysrKystDQo+ID4gIDEgZmlsZSBjaGFuZ2VkLCAx
NiBpbnNlcnRpb25zKCspLCAxIGRlbGV0aW9uKC0pDQo+ID4gDQo+ID4gZGlmZiAtLWdpdCBhL2Ry
aXZlcnMvaW5maW5pYmFuZC9ody92bXdfcHZyZG1hL3B2cmRtYV9jcS5jIGIvZHJpdmVycy9pbmZp
bmliYW5kL2h3L3Ztd19wdnJkbWEvcHZyZG1hX2NxLmMNCj4gPiBpbmRleCA2OWJkYTYxLi45MGFh
MzI2IDEwMDY0NA0KPiA+IC0tLSBhL2RyaXZlcnMvaW5maW5pYmFuZC9ody92bXdfcHZyZG1hL3B2
cmRtYV9jcS5jDQo+ID4gKysrIGIvZHJpdmVycy9pbmZpbmliYW5kL2h3L3Ztd19wdnJkbWEvcHZy
ZG1hX2NxLmMNCj4gPiBAQCAtNjUsMTMgKzY1LDI4IEBAIGludCBwdnJkbWFfcmVxX25vdGlmeV9j
cShzdHJ1Y3QgaWJfY3EgKmliY3EsDQo+ID4gIAlzdHJ1Y3QgcHZyZG1hX2RldiAqZGV2ID0gdG9f
dmRldihpYmNxLT5kZXZpY2UpOw0KPiA+ICAJc3RydWN0IHB2cmRtYV9jcSAqY3EgPSB0b192Y3Eo
aWJjcSk7DQo+ID4gIAl1MzIgdmFsID0gY3EtPmNxX2hhbmRsZTsNCj4gPiArCXVuc2lnbmVkIGxv
bmcgZmxhZ3M7DQo+ID4gKwlpbnQgaGFzX2RhdGEgPSAwOw0KPiA+ICANCj4gPiAgCXZhbCB8PSAo
bm90aWZ5X2ZsYWdzICYgSUJfQ1FfU09MSUNJVEVEX01BU0spID09IElCX0NRX1NPTElDSVRFRCA/
DQo+ID4gIAkJUFZSRE1BX1VBUl9DUV9BUk1fU09MIDogUFZSRE1BX1VBUl9DUV9BUk07DQo+ID4g
IA0KPiA+ICsJc3Bpbl9sb2NrX2lycXNhdmUoJmNxLT5jcV9sb2NrLCBmbGFncyk7DQo+ID4gKw0K
PiA+ICAJcHZyZG1hX3dyaXRlX3Vhcl9jcShkZXYsIHZhbCk7DQo+ID4gIA0KPiA+IC0JcmV0dXJu
IDA7DQo+ID4gKwlpZiAobm90aWZ5X2ZsYWdzICYgSUJfQ1FfUkVQT1JUX01JU1NFRF9FVkVOVFMp
IHsNCj4gPiArCQl1bnNpZ25lZCBpbnQgaGVhZDsNCj4gPiArDQo+ID4gKwkJaGFzX2RhdGEgPSBw
dnJkbWFfaWR4X3JpbmdfaGFzX2RhdGEoJmNxLT5yaW5nX3N0YXRlLT5yeCwNCj4gPiArCQkJCQkJ
ICAgIGNxLT5pYmNxLmNxZSwgJmhlYWQpOw0KPiA+ICsJCWlmICh1bmxpa2VseShoYXNfZGF0YSA9
PSBQVlJETUFfSU5WQUxJRF9JRFgpKQ0KPiA+ICsJCQlkZXZfZXJyKCZkZXYtPnBkZXYtPmRldiwg
IkNRIHJpbmcgc3RhdGUgaW52YWxpZFxuIik7DQo+IA0KPiBJIHNlZSB0aGUgcG9pbnQgb2YgY2hl
Y2tpbmcgdGhlIHJldHVybiB2YWx1ZSBidXQgcGVyIG15IHVuZGVyc3RhbmRpbmcsIGFuZA0KPiBj
b3JyZWN0IG1lIGlmIGknbSB3cm9uZywgdGhpcyByYXJlIGNhc2UgcG9pbnRzIHRvIGEgY29ycnVw
dGVkIHJpbmcgd2hpY2gNCj4gY2FuIGhhcHBlbiAqb25seSogaW4gY2FzZSBvZiBhIGJ1ZyBzbyBp
dCBpcyBub3QgImVycm9yIiBieSBuYXR1cmUuDQo+IElmIHRoaXMgaXMgY29ycmVjdCB0aGVuIGkg
ZG9uJ3Qgc2VlIHRoZSBwb2ludCBvZiBoYXZpbmcgdGhpcyAicXVlc3Rpb24iIG9uDQo+IGV2ZXJ5
IGNhbGwgdG8gaWJfbm90aWZ5X2NxLg0KPiANCj4gRG8geW91IGFncmVlIHRvIG1vdmUgdGhpcyBj
aGVjayB0byBwdnJkbWFfaWR4X3JpbmdfaGFzX2RhdGEgYW5kIGV2ZW4gbWFrZQ0KPiB0aGUgZnVu
Y3Rpb24gdXNlIEJVR19PTj8NCg0KSSdsbCBjb25jZWRlIHRoYXQgd2hpbGUgaXQgcG9pbnRzIHRv
IGEgY29ycnVwdGVkIHJpbmcgKHRocm91Z2ggYSBkZXZpY2UgYnVnLCANCm1lbW9yeSBjb3JydXB0
aW9uKSBidXQgd2Ugd2FudCB0byByZXBvcnQgaXQgYXMgYSBkZXZpY2UgZXJyb3IgdG8gbWFpbnRh
aW4NCmNvbnNpc3RlbmN5IGluIG91ciBkcml2ZXIgYW5kIGdpdmUgVUxQcyBhIGNoYW5jZSB0byBj
bGVhbiB1cC4gQWxzbywgdGhlIGNvbXBpbGVyDQpvcHRpbWl6YXRpb24gc2hvdWxkIGhlbHAgaGVy
ZS4NCg0K
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yuval Shaia Aug. 10, 2017, 9:01 p.m. UTC | #3
On Thu, Aug 10, 2017 at 08:55:42PM +0000, Adit Ranadive wrote:
> On 8/10/17, 1:02 PM, "Yuval Shaia" <yuval.shaia@oracle.com> wrote:
> > On Thu, Aug 10, 2017 at 12:05:02PM -0700, Adit Ranadive wrote:
> > > From: Bryan Tan <bryantan@vmware.com>
> > > 
> > > There is a chance of a race between arming the CQ and receiving
> > > completions. By reporting CQ missed events any ULPs should poll
> > > again to get the completions.
> > > 
> > > Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver")
> > > Acked-by: Aditya Sarwade <asarwade@vmware.com>
> > > Signed-off-by: Bryan Tan <bryantan@vmware.com>
> > > Signed-off-by: Adit Ranadive <aditr@vmware.com>
> > > ---
> > > v0 -> v1:
> > >  - Check for invalid ring index.
> > > ---
> > >  drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c | 17 ++++++++++++++++-
> > >  1 file changed, 16 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
> > > index 69bda61..90aa326 100644
> > > --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
> > > +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
> > > @@ -65,13 +65,28 @@ int pvrdma_req_notify_cq(struct ib_cq *ibcq,
> > >  	struct pvrdma_dev *dev = to_vdev(ibcq->device);
> > >  	struct pvrdma_cq *cq = to_vcq(ibcq);
> > >  	u32 val = cq->cq_handle;
> > > +	unsigned long flags;
> > > +	int has_data = 0;
> > >  
> > >  	val |= (notify_flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED ?
> > >  		PVRDMA_UAR_CQ_ARM_SOL : PVRDMA_UAR_CQ_ARM;
> > >  
> > > +	spin_lock_irqsave(&cq->cq_lock, flags);
> > > +
> > >  	pvrdma_write_uar_cq(dev, val);
> > >  
> > > -	return 0;
> > > +	if (notify_flags & IB_CQ_REPORT_MISSED_EVENTS) {
> > > +		unsigned int head;
> > > +
> > > +		has_data = pvrdma_idx_ring_has_data(&cq->ring_state->rx,
> > > +						    cq->ibcq.cqe, &head);
> > > +		if (unlikely(has_data == PVRDMA_INVALID_IDX))
> > > +			dev_err(&dev->pdev->dev, "CQ ring state invalid\n");
> > 
> > I see the point of checking the return value but per my understanding, and
> > correct me if i'm wrong, this rare case points to a corrupted ring which
> > can happen *only* in case of a bug so it is not "error" by nature.
> > If this is correct then i don't see the point of having this "question" on
> > every call to ib_notify_cq.
> > 
> > Do you agree to move this check to pvrdma_idx_ring_has_data and even make
> > the function use BUG_ON?
> 
> I'll concede that while it points to a corrupted ring (through a device bug, 
> memory corruption) but we want to report it as a device error to maintain
> consistency in our driver and give ULPs a chance to clean up. Also, the compiler
> optimization should help here.

Great, i understand that.
So, at least can you consider moving this dev_err into
pvrdma_idx_ring_has_data so callers do not need to handle errors?

btw, same apply to pvrdma_idx_ring_has_space

> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Adit Ranadive Aug. 10, 2017, 9:26 p.m. UTC | #4
On 8/10/17, 2:01 PM, "Yuval Shaia" <yuval.shaia@oracle.com> wrote:
> Great, i understand that.

> So, at least can you consider moving this dev_err into

> pvrdma_idx_ring_has_data so callers do not need to handle errors?


Thanks for the suggestion. It's something we will have to discuss internally,
though I'm not sure if BUG_ON is the right way to go. That just seems a really
big hammer to me.
As it stands the patch should be added since it does fix a potential race
condition.
Yuval Shaia Aug. 10, 2017, 9:33 p.m. UTC | #5
On Thu, Aug 10, 2017 at 09:26:06PM +0000, Adit Ranadive wrote:
> On 8/10/17, 2:01 PM, "Yuval Shaia" <yuval.shaia@oracle.com> wrote:
> > Great, i understand that.
> > So, at least can you consider moving this dev_err into
> > pvrdma_idx_ring_has_data so callers do not need to handle errors?
> 
> Thanks for the suggestion. It's something we will have to discuss internally,
> though I'm not sure if BUG_ON is the right way to go. That just seems a really

No, i agree, BUG_ON is a mistake as ring corruption can be caused by HW, my
suggestion is just to place the check and error print inside the function
pvrdma_idx_ring_has_data so callers will not need to handle such errors.

Anyways, with or without taking the suggestion - fix lgtm.

Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>

> big hammer to me.
> As it stands the patch should be added since it does fix a potential race
> condition.
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Ledford Aug. 16, 2017, 2:59 p.m. UTC | #6
On Fri, 2017-08-11 at 00:33 +0300, Yuval Shaia wrote:
> On Thu, Aug 10, 2017 at 09:26:06PM +0000, Adit Ranadive wrote:
> > On 8/10/17, 2:01 PM, "Yuval Shaia" <yuval.shaia@oracle.com> wrote:
> > > Great, i understand that.
> > > So, at least can you consider moving this dev_err into
> > > pvrdma_idx_ring_has_data so callers do not need to handle errors?
> > 
> > Thanks for the suggestion. It's something we will have to discuss
> > internally,
> > though I'm not sure if BUG_ON is the right way to go. That just
> > seems a really
> 
> No, i agree, BUG_ON is a mistake as ring corruption can be caused by
> HW, 

BUG_ON is generally a mistake regardless.  Even if it was an issue in
the driver, another option would be to forcibly remove the ib_device
and shut it all down.  That would prevent a driver bug from taking the
machine down, and therefore is preferable to a BUG_ON.  If I see a
BUG_ON, I'm immediately going to be asking why that particular
condition is so bad that you think someone's Oracle server should die
entirely versus limping along with some disabled device or something
(because I know that's exactly what Linus will ask when he sees it).

> my
> suggestion is just to place the check and error print inside the
> function
> pvrdma_idx_ring_has_data so callers will not need to handle such
> errors.
> 
> Anyways, with or without taking the suggestion - fix lgtm.
> 
> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>

Pulled in for -rc.  Thanks.

> > big hammer to me.
> > As it stands the patch should be added since it does fix a
> > potential race
> > condition.
> >
Steven Rostedt Aug. 30, 2017, 3:31 p.m. UTC | #7
On Wed, 16 Aug 2017 10:59:52 -0400
Doug Ledford <dledford@redhat.com> wrote:

> (because I know that's exactly what Linus will ask when he sees it).

  https://lwn.net/Articles/13183/

Never ever use BUG_ON(), unless you are ready to really defend it. And
be prepared to be yelled at by Linus. Just use WARN_ON() and exit the
routine nicely.

-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
index 69bda61..90aa326 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
@@ -65,13 +65,28 @@  int pvrdma_req_notify_cq(struct ib_cq *ibcq,
 	struct pvrdma_dev *dev = to_vdev(ibcq->device);
 	struct pvrdma_cq *cq = to_vcq(ibcq);
 	u32 val = cq->cq_handle;
+	unsigned long flags;
+	int has_data = 0;
 
 	val |= (notify_flags & IB_CQ_SOLICITED_MASK) == IB_CQ_SOLICITED ?
 		PVRDMA_UAR_CQ_ARM_SOL : PVRDMA_UAR_CQ_ARM;
 
+	spin_lock_irqsave(&cq->cq_lock, flags);
+
 	pvrdma_write_uar_cq(dev, val);
 
-	return 0;
+	if (notify_flags & IB_CQ_REPORT_MISSED_EVENTS) {
+		unsigned int head;
+
+		has_data = pvrdma_idx_ring_has_data(&cq->ring_state->rx,
+						    cq->ibcq.cqe, &head);
+		if (unlikely(has_data == PVRDMA_INVALID_IDX))
+			dev_err(&dev->pdev->dev, "CQ ring state invalid\n");
+	}
+
+	spin_unlock_irqrestore(&cq->cq_lock, flags);
+
+	return has_data;
 }
 
 /**