diff mbox

[rdma-rc,2/5] IB/rxe: Fix handling of erroneous WR

Message ID 5bd83de1-64d3-e5a9-1c58-cca52d89d64a@sandisk.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

Bart Van Assche Dec. 13, 2016, 7:03 a.m. UTC
On 11/16/2016 09:39 AM, Leon Romanovsky wrote:
> @@ -745,13 +746,17 @@ int rxe_requester(void *arg)
>  	wqe->status = IB_WC_LOC_PROT_ERR;
>  	wqe->state = wqe_state_error;
>  
> -complete:
> -	if (qp_type(qp) != IB_QPT_RC) {
> -		while (rxe_completer(qp) == 0)
> -			;
> -	}
> -
> -	return 0;
> +	/*
> +	 * IBA Spec. Section 10.7.3.1 SIGNALED COMPLETIONS
> +	 * ---------8<---------8<-------------
> +	 * ...Note that if a completion error occurs, a Work Completion
> +	 * will always be generated, even if the signaling
> +	 * indicator requests an Unsignaled Completion.
> +	 * ---------8<---------8<-------------
> +	 */
> +	wqe->wr.send_flags |= IB_SEND_SIGNALED;
> +	__rxe_do_task(&qp->comp.task);
> +	return -EAGAIN;

Hello Leon and Yonatan,

Sorry for the late reply but I think setting IB_SEND_SIGNALED for WQE's
reporting a completion error is wrong. I noticed this because this change
conflicts with Doug's rxe branch on github. Have you considered to apply
the following (untested) patch instead of setting the IB_SEND_SIGNALED
flag?

                rxe_cq_post(qp->scq, &cqe, 0);

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Leon Romanovsky Dec. 13, 2016, 7:44 a.m. UTC | #1
On Tue, Dec 13, 2016 at 08:03:03AM +0100, Bart Van Assche wrote:
> On 11/16/2016 09:39 AM, Leon Romanovsky wrote:
> > @@ -745,13 +746,17 @@ int rxe_requester(void *arg)
> >  	wqe->status = IB_WC_LOC_PROT_ERR;
> >  	wqe->state = wqe_state_error;
> >
> > -complete:
> > -	if (qp_type(qp) != IB_QPT_RC) {
> > -		while (rxe_completer(qp) == 0)
> > -			;
> > -	}
> > -
> > -	return 0;
> > +	/*
> > +	 * IBA Spec. Section 10.7.3.1 SIGNALED COMPLETIONS
> > +	 * ---------8<---------8<-------------
> > +	 * ...Note that if a completion error occurs, a Work Completion
> > +	 * will always be generated, even if the signaling
> > +	 * indicator requests an Unsignaled Completion.
> > +	 * ---------8<---------8<-------------
> > +	 */
> > +	wqe->wr.send_flags |= IB_SEND_SIGNALED;
> > +	__rxe_do_task(&qp->comp.task);
> > +	return -EAGAIN;
>
> Hello Leon and Yonatan,
>
> Sorry for the late reply but I think setting IB_SEND_SIGNALED for WQE's
> reporting a completion error is wrong.

I'm not clear about it. I didn't find in spec what to do with IB_SEND_SIGNALED
flag in case of error.

According to spec:
	"C10-91: The CI shall generate a CQE when a Work Request
	completed under any of the following conditions:
	• The Work Request completed in error"


> I noticed this because this change
> conflicts with Doug's rxe branch on github. Have you considered to apply
> the following (untested) patch instead of setting the IB_SEND_SIGNALED
> flag?
>
> --- a/drivers/infiniband/sw/rxe/rxe_comp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_comp.c
> @@ -418,7 +418,7 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_w
> qe *wqe)
>
>         if ((qp->sq_sig_type == IB_SIGNAL_ALL_WR) ||
>             (wqe->wr.send_flags & IB_SEND_SIGNALED) ||
> -           (qp->req.state == QP_STATE_ERROR)) {
> +           wqe->status != IB_WC_SUCCESS) {
>                 make_send_cqe(qp, wqe, &cqe);
>                 advance_consumer(qp->sq.queue);
>                 rxe_cq_post(qp->scq, &cqe, 0);
>
> Bart.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche Dec. 13, 2016, 8:04 a.m. UTC | #2
On 12/13/2016 08:44 AM, Leon Romanovsky wrote:
> On Tue, Dec 13, 2016 at 08:03:03AM +0100, Bart Van Assche wrote:
>> On 11/16/2016 09:39 AM, Leon Romanovsky wrote:
>>> @@ -745,13 +746,17 @@ int rxe_requester(void *arg)
>>>  	wqe->status = IB_WC_LOC_PROT_ERR;
>>>  	wqe->state = wqe_state_error;
>>>
>>> -complete:
>>> -	if (qp_type(qp) != IB_QPT_RC) {
>>> -		while (rxe_completer(qp) == 0)
>>> -			;
>>> -	}
>>> -
>>> -	return 0;
>>> +	/*
>>> +	 * IBA Spec. Section 10.7.3.1 SIGNALED COMPLETIONS
>>> +	 * ---------8<---------8<-------------
>>> +	 * ...Note that if a completion error occurs, a Work Completion
>>> +	 * will always be generated, even if the signaling
>>> +	 * indicator requests an Unsignaled Completion.
>>> +	 * ---------8<---------8<-------------
>>> +	 */
>>> +	wqe->wr.send_flags |= IB_SEND_SIGNALED;
>>> +	__rxe_do_task(&qp->comp.task);
>>> +	return -EAGAIN;
>>
>> Hello Leon and Yonatan,
>>
>> Sorry for the late reply but I think setting IB_SEND_SIGNALED for WQE's
>> reporting a completion error is wrong.
>
> I'm not clear about it. I didn't find in spec what to do with IB_SEND_SIGNALED
> flag in case of error.
>
> According to spec:
> 	"C10-91: The CI shall generate a CQE when a Work Request
> 	completed under any of the following conditions:
> 	• The Work Request completed in error"

Hello Leon,

To me that paragraph from the spec means that do_complete() is wrong. 
And once do_complete() is fixed, callers that set the WQE status to 
"error" no longer have to set IB_SEND_SIGNALED.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky Dec. 13, 2016, 12:20 p.m. UTC | #3
On Tue, Dec 13, 2016 at 09:04:44AM +0100, Bart Van Assche wrote:
> On 12/13/2016 08:44 AM, Leon Romanovsky wrote:
> > On Tue, Dec 13, 2016 at 08:03:03AM +0100, Bart Van Assche wrote:
> > > On 11/16/2016 09:39 AM, Leon Romanovsky wrote:
> > > > @@ -745,13 +746,17 @@ int rxe_requester(void *arg)
> > > >  	wqe->status = IB_WC_LOC_PROT_ERR;
> > > >  	wqe->state = wqe_state_error;
> > > >
> > > > -complete:
> > > > -	if (qp_type(qp) != IB_QPT_RC) {
> > > > -		while (rxe_completer(qp) == 0)
> > > > -			;
> > > > -	}
> > > > -
> > > > -	return 0;
> > > > +	/*
> > > > +	 * IBA Spec. Section 10.7.3.1 SIGNALED COMPLETIONS
> > > > +	 * ---------8<---------8<-------------
> > > > +	 * ...Note that if a completion error occurs, a Work Completion
> > > > +	 * will always be generated, even if the signaling
> > > > +	 * indicator requests an Unsignaled Completion.
> > > > +	 * ---------8<---------8<-------------
> > > > +	 */
> > > > +	wqe->wr.send_flags |= IB_SEND_SIGNALED;
> > > > +	__rxe_do_task(&qp->comp.task);
> > > > +	return -EAGAIN;
> > >
> > > Hello Leon and Yonatan,
> > >
> > > Sorry for the late reply but I think setting IB_SEND_SIGNALED for WQE's
> > > reporting a completion error is wrong.
> >
> > I'm not clear about it. I didn't find in spec what to do with IB_SEND_SIGNALED
> > flag in case of error.
> >
> > According to spec:
> > 	"C10-91: The CI shall generate a CQE when a Work Request
> > 	completed under any of the following conditions:
> > 	• The Work Request completed in error"
>
> Hello Leon,
>
> To me that paragraph from the spec means that do_complete() is wrong. And
> once do_complete() is fixed, callers that set the WQE status to "error" no
> longer have to set IB_SEND_SIGNALED.

Thanks,
It looks like a right thing to do. Yonatan is handling it.

>
> Bart.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -418,7 +418,7 @@  static void do_complete(struct rxe_qp *qp, struct rxe_send_w
qe *wqe)
 
        if ((qp->sq_sig_type == IB_SIGNAL_ALL_WR) ||
            (wqe->wr.send_flags & IB_SEND_SIGNALED) ||
-           (qp->req.state == QP_STATE_ERROR)) {
+           wqe->status != IB_WC_SUCCESS) {
                make_send_cqe(qp, wqe, &cqe);
                advance_consumer(qp->sq.queue);