diff mbox

[1/3] IB: new common API for draining queues

Message ID 3e7261d1436d33320223d365974ff38945f0d558.1455230646.git.swise@chelsio.com (mailing list archive)
State Superseded
Headers show

Commit Message

Steve Wise Feb. 5, 2016, 9:13 p.m. UTC
From: Steve Wise <swise@opengridcomputing.com.com>

Add provider-specific drain_sq/drain_rq functions for providers needing
special drain logic.

Add static functions __ib_drain_sq() and __ib_drain_rq() which post noop
WRs to the SQ or RQ and block until their completions are processed.
This ensures the applications completions have all been processed.

Add API functions ib_drain_sq(), ib_drain_rq(), and ib_drain_qp().

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
---
 drivers/infiniband/core/verbs.c | 144 ++++++++++++++++++++++++++++++++++++++++
 include/rdma/ib_verbs.h         |   5 ++
 2 files changed, 149 insertions(+)

Comments

Bart Van Assche Feb. 11, 2016, 10:59 p.m. UTC | #1
On 02/05/2016 01:13 PM, Steve Wise wrote:
> +static void wait_for_drain(struct ib_cq *cq, struct completion *c)
> +{
> +	if (cq->poll_ctx == IB_POLL_DIRECT)
> +		do
> +			ib_process_cq_direct(cq, 1024);
> +		while (!wait_for_completion_timeout(c, msecs_to_jiffies(100)));
> +	else
> +		wait_for_completion(c);
> +}

Hello Steve,

Have you verified this patch with checkpatch ? I expect that checkpatch 
will report that six braces are missing from this function.

Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche Feb. 11, 2016, 11:15 p.m. UTC | #2
On 02/11/2016 02:59 PM, Bart Van Assche wrote:
> On 02/05/2016 01:13 PM, Steve Wise wrote:
>> +static void wait_for_drain(struct ib_cq *cq, struct completion *c)
>> +{
>> +    if (cq->poll_ctx == IB_POLL_DIRECT)
>> +        do
>> +            ib_process_cq_direct(cq, 1024);
>> +        while (!wait_for_completion_timeout(c, msecs_to_jiffies(100)));
>> +    else
>> +        wait_for_completion(c);
>> +}
>
> Hello Steve,
>
> Have you verified this patch with checkpatch ? I expect that checkpatch
> will report that six braces are missing from this function.

(replying to my own e-mail)

BTW, this patch series does not call wait_for_drain() for any queue that 
uses IB_POLL_DIRECT. This means that the "if (cq->poll_ctx == 
IB_POLL_DIRECT)" part is untested. Please do propose untested code for 
upstream inclusion.

Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise Feb. 11, 2016, 11:18 p.m. UTC | #3
> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-owner@vger.kernel.org] On Behalf Of Bart Van Assche
> Sent: Thursday, February 11, 2016 4:59 PM
> To: Steve Wise; linux-rdma@vger.kernel.org
> Subject: Re: [PATCH 1/3] IB: new common API for draining queues
> 
> On 02/05/2016 01:13 PM, Steve Wise wrote:
> > +static void wait_for_drain(struct ib_cq *cq, struct completion *c)
> > +{
> > +	if (cq->poll_ctx == IB_POLL_DIRECT)
> > +		do
> > +			ib_process_cq_direct(cq, 1024);
> > +		while (!wait_for_completion_timeout(c, msecs_to_jiffies(100)));
> > +	else
> > +		wait_for_completion(c);
> > +}
> 
> Hello Steve,
> 
> Have you verified this patch with checkpatch ? I expect that checkpatch
> will report that six braces are missing from this function.
> 
> Thanks,
> 
> Bart.

I'll run it through.  Thanks.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise Feb. 11, 2016, 11:20 p.m. UTC | #4
> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-owner@vger.kernel.org] On Behalf Of Bart Van Assche
> Sent: Thursday, February 11, 2016 5:16 PM
> To: Steve Wise; linux-rdma@vger.kernel.org
> Subject: Re: [PATCH 1/3] IB: new common API for draining queues
> 
> On 02/11/2016 02:59 PM, Bart Van Assche wrote:
> > On 02/05/2016 01:13 PM, Steve Wise wrote:
> >> +static void wait_for_drain(struct ib_cq *cq, struct completion *c)
> >> +{
> >> +    if (cq->poll_ctx == IB_POLL_DIRECT)
> >> +        do
> >> +            ib_process_cq_direct(cq, 1024);
> >> +        while (!wait_for_completion_timeout(c, msecs_to_jiffies(100)));
> >> +    else
> >> +        wait_for_completion(c);
> >> +}
> >
> > Hello Steve,
> >
> > Have you verified this patch with checkpatch ? I expect that checkpatch
> > will report that six braces are missing from this function.
> 
> (replying to my own e-mail)
> 
> BTW, this patch series does not call wait_for_drain() for any queue that
> uses IB_POLL_DIRECT. This means that the "if (cq->poll_ctx ==
> IB_POLL_DIRECT)" part is untested. Please do propose untested code for
> upstream inclusion.
> 

I guess you're making your point to not post code that has no user...



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche Feb. 11, 2016, 11:23 p.m. UTC | #5
On 02/11/2016 03:20 PM, Steve Wise wrote:
> I guess you're making your point to not post code that has no user...

How about using WARN_ONCE(cq->poll_ctx == IB_POLL_DIRECT) instead ?

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise Feb. 12, 2016, 12:32 a.m. UTC | #6
On 2/11/2016 5:23 PM, Bart Van Assche wrote:
> On 02/11/2016 03:20 PM, Steve Wise wrote:
>> I guess you're making your point to not post code that has no user...
>
> How about using WARN_ONCE(cq->poll_ctx == IB_POLL_DIRECT) instead ?

That's fine with me.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Devesh Sharma Feb. 12, 2016, 5:19 a.m. UTC | #7
On Sat, Feb 6, 2016 at 2:43 AM, Steve Wise <swise@opengridcomputing.com> wrote:
> From: Steve Wise <swise@opengridcomputing.com.com>
>
> Add provider-specific drain_sq/drain_rq functions for providers needing
> special drain logic.
>
> Add static functions __ib_drain_sq() and __ib_drain_rq() which post noop
> WRs to the SQ or RQ and block until their completions are processed.
> This ensures the applications completions have all been processed.
>
> Add API functions ib_drain_sq(), ib_drain_rq(), and ib_drain_qp().
>
> Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
> Signed-off-by: Steve Wise <swise@opengridcomputing.com>
> ---
>  drivers/infiniband/core/verbs.c | 144 ++++++++++++++++++++++++++++++++++++++++
>  include/rdma/ib_verbs.h         |   5 ++
>  2 files changed, 149 insertions(+)
>
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index 5af6d02..aed521e 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -1657,3 +1657,147 @@ next_page:
>         return i;
>  }
>  EXPORT_SYMBOL(ib_sg_to_pages);
> +
> +struct ib_drain_cqe {
> +       struct ib_cqe cqe;
> +       struct completion done;
> +};
> +
> +static void ib_drain_qp_done(struct ib_cq *cq, struct ib_wc *wc)
> +{
> +       struct ib_drain_cqe *cqe = container_of(wc->wr_cqe, struct ib_drain_cqe,
> +                                               cqe);
> +
> +       complete(&cqe->done);
> +}
> +
> +static void wait_for_drain(struct ib_cq *cq, struct completion *c)
> +{
> +       if (cq->poll_ctx == IB_POLL_DIRECT)
> +               do
> +                       ib_process_cq_direct(cq, 1024);
> +               while (!wait_for_completion_timeout(c, msecs_to_jiffies(100)));
> +       else
> +               wait_for_completion(c);
> +}
> +
> +/*
> + * Post a WR and block until its completion is reaped for the SQ.
> + */
> +static void __ib_drain_sq(struct ib_qp *qp)
> +{
> +       struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
> +       struct ib_drain_cqe sdrain;
> +       struct ib_send_wr swr = {}, *bad_swr;
> +       int ret;
> +
> +       swr.wr_cqe = &sdrain.cqe;
> +       sdrain.cqe.done = ib_drain_qp_done;
> +       init_completion(&sdrain.done);
> +
> +       ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
> +       if (ret) {
> +               WARN_ONCE(ret, "failed to drain QP: %d\n", ret);
> +               return;
> +       }

Should it query qp-state and then modify if needed? I am not liking the
the extra step which is highly unlikely in a normal usage conditions?
especially for a shared CQ it is really redundant.

> +
> +       ret = ib_post_send(qp, &swr, &bad_swr);
> +       if (ret) {
> +               WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
> +               return;
> +       }
> +
> +       wait_for_drain(qp->send_cq, &sdrain.done);
> +}
> +
> +/*
> + * Post a WR and block until its completion is reaped for the RQ.
> + */
> +static void __ib_drain_rq(struct ib_qp *qp)
> +{
> +       struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
> +       struct ib_drain_cqe rdrain;
> +       struct ib_recv_wr rwr = {}, *bad_rwr;
> +       int ret;
> +
> +       rwr.wr_cqe = &rdrain.cqe;
> +       rdrain.cqe.done = ib_drain_qp_done;
> +       init_completion(&rdrain.done);
> +
> +       ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
> +       if (ret) {
> +               WARN_ONCE(ret, "failed to drain QP: %d\n", ret);
> +               return;
> +       }
> +
> +       ret = ib_post_recv(qp, &rwr, &bad_rwr);
> +       if (ret) {
> +               WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
> +               return;
> +       }
> +
> +       wait_for_drain(qp->recv_cq, &rdrain.done);
> +}
> +
> +/**
> + * ib_drain_sq() - Block until all SQ CQEs have been consumed by the
> + *                application.
> + * @qp:            queue pair to drain
> + *
> + * If the device has a provider-specific drain function, then
> + * call that.  Otherwise call the generic drain function
> + * __ib_drain_sq().
> + *
> + * The consumer must ensure there is room in the CQ and SQ
> + * for a drain work requests.  Also the consumer must allocate the
> + * CQ using ib_alloc_cq().  It is up to the consumer to handle concurrency
> + * issues if the CQs are using the IB_POLL_DIRECT polling context.
> + */
> +void ib_drain_sq(struct ib_qp *qp)
> +{
> +       if (qp->device->drain_sq)
> +               qp->device->drain_sq(qp);
> +       else
> +               __ib_drain_sq(qp);
> +}
> +EXPORT_SYMBOL(ib_drain_sq);
> +
> +/**
> + * ib_drain_rq() - Block until all RQ CQEs have been consumed by the
> + *                application.
> + * @qp:            queue pair to drain
> + *
> + * If the device has a provider-specific drain function, then
> + * call that.  Otherwise call the generic drain function
> + * __ib_drain_rq().
> + *
> + * The consumer must ensure there is room in the CQ and RQ
> + * for a drain work requests.  Also the consumer must allocate the
> + * CQ using ib_alloc_cq().  It is up to the consumer to handle concurrency
> + * issues if the CQs are using the IB_POLL_DIRECT polling context.
> + */
> +void ib_drain_rq(struct ib_qp *qp)
> +{
> +       if (qp->device->drain_rq)
> +               qp->device->drain_rq(qp);
> +       else
> +               __ib_drain_rq(qp);
> +}
> +EXPORT_SYMBOL(ib_drain_rq);
> +
> +/**
> + * ib_drain_qp() - Block until all CQEs have been consumed by the
> + *                application on both the RQ and SQ.
> + * @qp:            queue pair to drain
> + *
> + * The consumer must ensure there is room in the CQ(s), SQ, and RQ
> + * for a drain work requests.  Also the consumer must allocate the
> + * CQ using ib_alloc_cq().  It is up to the consumer to handle concurrency
> + * issues if the CQs are using the IB_POLL_DIRECT polling context.
> + */
> +void ib_drain_qp(struct ib_qp *qp)
> +{
> +       ib_drain_sq(qp);
> +       ib_drain_rq(qp);
> +}
> +EXPORT_SYMBOL(ib_drain_qp);
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 284b00c..68b7e97 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1846,6 +1846,8 @@ struct ib_device {
>         int                        (*check_mr_status)(struct ib_mr *mr, u32 check_mask,
>                                                       struct ib_mr_status *mr_status);
>         void                       (*disassociate_ucontext)(struct ib_ucontext *ibcontext);
> +       void                       (*drain_rq)(struct ib_qp *qp);
> +       void                       (*drain_sq)(struct ib_qp *qp);
>
>         struct ib_dma_mapping_ops   *dma_ops;
>
> @@ -3094,4 +3096,7 @@ int ib_sg_to_pages(struct ib_mr *mr,
>                    int sg_nents,
>                    int (*set_page)(struct ib_mr *, u64));
>
> +void ib_drain_rq(struct ib_qp *qp);
> +void ib_drain_sq(struct ib_qp *qp);
> +void ib_drain_qp(struct ib_qp *qp);
>  #endif /* IB_VERBS_H */
> --
> 2.7.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Wise Feb. 12, 2016, 3:17 p.m. UTC | #8
> > +/*
> > + * Post a WR and block until its completion is reaped for the SQ.
> > + */
> > +static void __ib_drain_sq(struct ib_qp *qp)
> > +{
> > +       struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
> > +       struct ib_drain_cqe sdrain;
> > +       struct ib_send_wr swr = {}, *bad_swr;
> > +       int ret;
> > +
> > +       swr.wr_cqe = &sdrain.cqe;
> > +       sdrain.cqe.done = ib_drain_qp_done;
> > +       init_completion(&sdrain.done);
> > +
> > +       ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
> > +       if (ret) {
> > +               WARN_ONCE(ret, "failed to drain QP: %d\n", ret);
> > +               return;
> > +       }
> 
> Should it query qp-state and then modify if needed? I am not liking the
> the extra step which is highly unlikely in a normal usage conditions?
> especially for a shared CQ it is really redundant.
>

Hey Devesh, on a previous round of reviews for this series, I thought you agreed that this was fine? 

See: http://www.spinics.net/lists/linux-rdma/msg33202.html

My personal opinion is that calling query_qp() and only modifying if needed is as much work as just calling modify since ERR->ERR is basically a noop.

Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Devesh Sharma Feb. 13, 2016, 3:41 p.m. UTC | #9
Hi Steve,


On Fri, Feb 12, 2016 at 8:47 PM, Steve Wise <swise@opengridcomputing.com> wrote:
>> > +/*
>> > + * Post a WR and block until its completion is reaped for the SQ.
>> > + */
>> > +static void __ib_drain_sq(struct ib_qp *qp)
>> > +{
>> > +       struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
>> > +       struct ib_drain_cqe sdrain;
>> > +       struct ib_send_wr swr = {}, *bad_swr;
>> > +       int ret;
>> > +
>> > +       swr.wr_cqe = &sdrain.cqe;
>> > +       sdrain.cqe.done = ib_drain_qp_done;
>> > +       init_completion(&sdrain.done);
>> > +
>> > +       ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
>> > +       if (ret) {
>> > +               WARN_ONCE(ret, "failed to drain QP: %d\n", ret);
>> > +               return;
>> > +       }
>>
>> Should it query qp-state and then modify if needed? I am not liking the
>> the extra step which is highly unlikely in a normal usage conditions?
>> especially for a shared CQ it is really redundant.
>>
>
> Hey Devesh, on a previous round of reviews for this series, I thought you agreed that this was fine?

Yes, I agreed, I was trying to give it second thought and just posted
what I had. :)

>
> See: http://www.spinics.net/lists/linux-rdma/msg33202.html
>
> My personal opinion is that calling query_qp() and only modifying if needed is as much work as just calling modify since ERR->ERR is basically a noop.

I agree that query_qp() is an overkill.

>
> Steve.
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Ledford Feb. 15, 2016, 9:05 p.m. UTC | #10
On 02/05/2016 04:13 PM, Steve Wise wrote:
> From: Steve Wise <swise@opengridcomputing.com.com>
> 
> Add provider-specific drain_sq/drain_rq functions for providers needing
> special drain logic.
> 
> Add static functions __ib_drain_sq() and __ib_drain_rq() which post noop
> WRs to the SQ or RQ and block until their completions are processed.
> This ensures the applications completions have all been processed.

Except it doesn't, comments inline below...and applications is
possessive, so needs a '

> Add API functions ib_drain_sq(), ib_drain_rq(), and ib_drain_qp().
> 
> Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
> Signed-off-by: Steve Wise <swise@opengridcomputing.com>
> ---
>  drivers/infiniband/core/verbs.c | 144 ++++++++++++++++++++++++++++++++++++++++
>  include/rdma/ib_verbs.h         |   5 ++
>  2 files changed, 149 insertions(+)
> 
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index 5af6d02..aed521e 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -1657,3 +1657,147 @@ next_page:
>  	return i;
>  }
>  EXPORT_SYMBOL(ib_sg_to_pages);
> +
> +struct ib_drain_cqe {
> +	struct ib_cqe cqe;
> +	struct completion done;
> +};
> +
> +static void ib_drain_qp_done(struct ib_cq *cq, struct ib_wc *wc)
> +{
> +	struct ib_drain_cqe *cqe = container_of(wc->wr_cqe, struct ib_drain_cqe,
> +						cqe);
> +
> +	complete(&cqe->done);
> +}
> +
> +static void wait_for_drain(struct ib_cq *cq, struct completion *c)
> +{
> +	if (cq->poll_ctx == IB_POLL_DIRECT)
> +		do
> +			ib_process_cq_direct(cq, 1024);
> +		while (!wait_for_completion_timeout(c, msecs_to_jiffies(100)));
> +	else
> +		wait_for_completion(c);
> +}
> +
> +/*
> + * Post a WR and block until its completion is reaped for the SQ.
> + */
> +static void __ib_drain_sq(struct ib_qp *qp)
> +{
> +	struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
> +	struct ib_drain_cqe sdrain;
> +	struct ib_send_wr swr = {}, *bad_swr;
> +	int ret;
> +
> +	swr.wr_cqe = &sdrain.cqe;
> +	sdrain.cqe.done = ib_drain_qp_done;
> +	init_completion(&sdrain.done);
> +
> +	ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
> +	if (ret) {
> +		WARN_ONCE(ret, "failed to drain QP: %d\n", ret);
> +		return;
> +	}

Set QP to ERR state...OK...

> +	ret = ib_post_send(qp, &swr, &bad_swr);
> +	if (ret) {
> +		WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
> +		return;
> +	}

Post command to QP in ERR state (admittedly, I never do this and hadn't
even thought about whether or not it is allowed...obviously it is, but
my initial reaction would have been "won't ib_post_send return an error
when you try to post to a QP in ERR state?")....OK...

> +	wait_for_drain(qp->send_cq, &sdrain.done);

Wait for your posted WR to complete...OK...

As I mentioned in my comment above, I would have thought that the
attempt to post a send to a QP in ERR state would have returned an
error.  It must not or else this patch is worthless because of the order
of actions.  What that highlights though, is that this code will drain a
QP, but only if the caller has taken the time to stop all possible
contexts that might run on other cores and post commands to the QP.
Those commands will error out, but the caller must, none the less, take
steps to block other contexts from sending or else this drain is
useless.  That might be fine for the API, but it should be clearly
documented, and currently it isn't.

> +}
> +
> +/*
> + * Post a WR and block until its completion is reaped for the RQ.
> + */
> +static void __ib_drain_rq(struct ib_qp *qp)
> +{
> +	struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
> +	struct ib_drain_cqe rdrain;
> +	struct ib_recv_wr rwr = {}, *bad_rwr;
> +	int ret;
> +
> +	rwr.wr_cqe = &rdrain.cqe;
> +	rdrain.cqe.done = ib_drain_qp_done;
> +	init_completion(&rdrain.done);
> +
> +	ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
> +	if (ret) {
> +		WARN_ONCE(ret, "failed to drain QP: %d\n", ret);
> +		return;
> +	}
> +
> +	ret = ib_post_recv(qp, &rwr, &bad_rwr);
> +	if (ret) {
> +		WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
> +		return;
> +	}
> +
> +	wait_for_drain(qp->recv_cq, &rdrain.done);
> +}
> +
> +/**
> + * ib_drain_sq() - Block until all SQ CQEs have been consumed by the
> + *		   application.
> + * @qp:            queue pair to drain
> + *
> + * If the device has a provider-specific drain function, then
> + * call that.  Otherwise call the generic drain function
> + * __ib_drain_sq().
> + *
> + * The consumer must ensure there is room in the CQ and SQ
> + * for a drain work requests.  Also the consumer must allocate the

Either requests should be singular, or you should remove the article 'a'.

> + * CQ using ib_alloc_cq().  It is up to the consumer to handle concurrency
> + * issues if the CQs are using the IB_POLL_DIRECT polling context.
> + */
> +void ib_drain_sq(struct ib_qp *qp)
> +{
> +	if (qp->device->drain_sq)
> +		qp->device->drain_sq(qp);
> +	else
> +		__ib_drain_sq(qp);
> +}
> +EXPORT_SYMBOL(ib_drain_sq);
> +
> +/**
> + * ib_drain_rq() - Block until all RQ CQEs have been consumed by the
> + *		   application.
> + * @qp:            queue pair to drain
> + *
> + * If the device has a provider-specific drain function, then
> + * call that.  Otherwise call the generic drain function
> + * __ib_drain_rq().
> + *
> + * The consumer must ensure there is room in the CQ and RQ
> + * for a drain work requests.  Also the consumer must allocate the

Ditto...

> + * CQ using ib_alloc_cq().  It is up to the consumer to handle concurrency
> + * issues if the CQs are using the IB_POLL_DIRECT polling context.
> + */
> +void ib_drain_rq(struct ib_qp *qp)
> +{
> +	if (qp->device->drain_rq)
> +		qp->device->drain_rq(qp);
> +	else
> +		__ib_drain_rq(qp);
> +}
> +EXPORT_SYMBOL(ib_drain_rq);
> +
> +/**
> + * ib_drain_qp() - Block until all CQEs have been consumed by the
> + *		   application on both the RQ and SQ.
> + * @qp:            queue pair to drain
> + *
> + * The consumer must ensure there is room in the CQ(s), SQ, and RQ
> + * for a drain work requests.  Also the consumer must allocate the

Ditto...

> + * CQ using ib_alloc_cq().  It is up to the consumer to handle concurrency
> + * issues if the CQs are using the IB_POLL_DIRECT polling context.
> + */
> +void ib_drain_qp(struct ib_qp *qp)
> +{
> +	ib_drain_sq(qp);
> +	ib_drain_rq(qp);
> +}
> +EXPORT_SYMBOL(ib_drain_qp);
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 284b00c..68b7e97 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -1846,6 +1846,8 @@ struct ib_device {
>  	int			   (*check_mr_status)(struct ib_mr *mr, u32 check_mask,
>  						      struct ib_mr_status *mr_status);
>  	void			   (*disassociate_ucontext)(struct ib_ucontext *ibcontext);
> +	void			   (*drain_rq)(struct ib_qp *qp);
> +	void			   (*drain_sq)(struct ib_qp *qp);
>  
>  	struct ib_dma_mapping_ops   *dma_ops;
>  
> @@ -3094,4 +3096,7 @@ int ib_sg_to_pages(struct ib_mr *mr,
>  		   int sg_nents,
>  		   int (*set_page)(struct ib_mr *, u64));
>  
> +void ib_drain_rq(struct ib_qp *qp);
> +void ib_drain_sq(struct ib_qp *qp);
> +void ib_drain_qp(struct ib_qp *qp);
>  #endif /* IB_VERBS_H */
>
Steve Wise Feb. 15, 2016, 9:20 p.m. UTC | #11
> -----Original Message-----
> From: Doug Ledford [mailto:dledford@redhat.com]
> Sent: Monday, February 15, 2016 3:06 PM
> To: Steve Wise; linux-rdma@vger.kernel.org
> Cc: bart.vanassche@sandisk.com
> Subject: Re: [PATCH 1/3] IB: new common API for draining queues
> 
> On 02/05/2016 04:13 PM, Steve Wise wrote:
> > From: Steve Wise <swise@opengridcomputing.com.com>
> >
> > Add provider-specific drain_sq/drain_rq functions for providers needing
> > special drain logic.
> >
> > Add static functions __ib_drain_sq() and __ib_drain_rq() which post noop
> > WRs to the SQ or RQ and block until their completions are processed.
> > This ensures the applications completions have all been processed.
> 
> Except it doesn't, comments inline below...and applications is
> possessive, so needs a '
> 
> > Add API functions ib_drain_sq(), ib_drain_rq(), and ib_drain_qp().
> >
> > Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
> > Signed-off-by: Steve Wise <swise@opengridcomputing.com>
> > ---
> >  drivers/infiniband/core/verbs.c | 144 ++++++++++++++++++++++++++++++++++++++++
> >  include/rdma/ib_verbs.h         |   5 ++
> >  2 files changed, 149 insertions(+)
> >
> > diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> > index 5af6d02..aed521e 100644
> > --- a/drivers/infiniband/core/verbs.c
> > +++ b/drivers/infiniband/core/verbs.c
> > @@ -1657,3 +1657,147 @@ next_page:
> >  	return i;
> >  }
> >  EXPORT_SYMBOL(ib_sg_to_pages);
> > +
> > +struct ib_drain_cqe {
> > +	struct ib_cqe cqe;
> > +	struct completion done;
> > +};
> > +
> > +static void ib_drain_qp_done(struct ib_cq *cq, struct ib_wc *wc)
> > +{
> > +	struct ib_drain_cqe *cqe = container_of(wc->wr_cqe, struct ib_drain_cqe,
> > +						cqe);
> > +
> > +	complete(&cqe->done);
> > +}
> > +
> > +static void wait_for_drain(struct ib_cq *cq, struct completion *c)
> > +{
> > +	if (cq->poll_ctx == IB_POLL_DIRECT)
> > +		do
> > +			ib_process_cq_direct(cq, 1024);
> > +		while (!wait_for_completion_timeout(c, msecs_to_jiffies(100)));
> > +	else
> > +		wait_for_completion(c);
> > +}
> > +
> > +/*
> > + * Post a WR and block until its completion is reaped for the SQ.
> > + */
> > +static void __ib_drain_sq(struct ib_qp *qp)
> > +{
> > +	struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
> > +	struct ib_drain_cqe sdrain;
> > +	struct ib_send_wr swr = {}, *bad_swr;
> > +	int ret;
> > +
> > +	swr.wr_cqe = &sdrain.cqe;
> > +	sdrain.cqe.done = ib_drain_qp_done;
> > +	init_completion(&sdrain.done);
> > +
> > +	ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
> > +	if (ret) {
> > +		WARN_ONCE(ret, "failed to drain QP: %d\n", ret);
> > +		return;
> > +	}
> 
> Set QP to ERR state...OK...
> 
> > +	ret = ib_post_send(qp, &swr, &bad_swr);
> > +	if (ret) {
> > +		WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
> > +		return;
> > +	}
> 
> Post command to QP in ERR state (admittedly, I never do this and hadn't
> even thought about whether or not it is allowed...obviously it is, but
> my initial reaction would have been "won't ib_post_send return an error
> when you try to post to a QP in ERR state?")....OK...
> 

IBTA spec sez a post to a QP in ERR will result in a CQE in the CQ with status FLUSHED.  IE it mandates this behavior.

> > +	wait_for_drain(qp->send_cq, &sdrain.done);
> 
> Wait for your posted WR to complete...OK...
> 
> As I mentioned in my comment above, I would have thought that the
> attempt to post a send to a QP in ERR state would have returned an
> error.  It must not or else this patch is worthless because of the order
> of actions.  What that highlights though, is that this code will drain a
> QP, but only if the caller has taken the time to stop all possible
> contexts that might run on other cores and post commands to the QP.
> Those commands will error out, but the caller must, none the less, take
> steps to block other contexts from sending or else this drain is
> useless.  That might be fine for the API, but it should be clearly
> documented, and currently it isn't.
> 

I'll clarify this in the function header comments.

> > +}
> > +
> > +/*
> > + * Post a WR and block until its completion is reaped for the RQ.
> > + */
> > +static void __ib_drain_rq(struct ib_qp *qp)
> > +{
> > +	struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
> > +	struct ib_drain_cqe rdrain;
> > +	struct ib_recv_wr rwr = {}, *bad_rwr;
> > +	int ret;
> > +
> > +	rwr.wr_cqe = &rdrain.cqe;
> > +	rdrain.cqe.done = ib_drain_qp_done;
> > +	init_completion(&rdrain.done);
> > +
> > +	ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
> > +	if (ret) {
> > +		WARN_ONCE(ret, "failed to drain QP: %d\n", ret);
> > +		return;
> > +	}
> > +
> > +	ret = ib_post_recv(qp, &rwr, &bad_rwr);
> > +	if (ret) {
> > +		WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
> > +		return;
> > +	}
> > +
> > +	wait_for_drain(qp->recv_cq, &rdrain.done);
> > +}
> > +
> > +/**
> > + * ib_drain_sq() - Block until all SQ CQEs have been consumed by the
> > + *		   application.
> > + * @qp:            queue pair to drain
> > + *
> > + * If the device has a provider-specific drain function, then
> > + * call that.  Otherwise call the generic drain function
> > + * __ib_drain_sq().
> > + *
> > + * The consumer must ensure there is room in the CQ and SQ
> > + * for a drain work requests.  Also the consumer must allocate the
> 
> Either requests should be singular, or you should remove the article 'a'.
> 

agreed (and agree to all the dittos)


> > + * CQ using ib_alloc_cq().  It is up to the consumer to handle concurrency
> > + * issues if the CQs are using the IB_POLL_DIRECT polling context.
> > + */
> > +void ib_drain_sq(struct ib_qp *qp)
> > +{
> > +	if (qp->device->drain_sq)
> > +		qp->device->drain_sq(qp);
> > +	else
> > +		__ib_drain_sq(qp);
> > +}
> > +EXPORT_SYMBOL(ib_drain_sq);
> > +
> > +/**
> > + * ib_drain_rq() - Block until all RQ CQEs have been consumed by the
> > + *		   application.
> > + * @qp:            queue pair to drain
> > + *
> > + * If the device has a provider-specific drain function, then
> > + * call that.  Otherwise call the generic drain function
> > + * __ib_drain_rq().
> > + *
> > + * The consumer must ensure there is room in the CQ and RQ
> > + * for a drain work requests.  Also the consumer must allocate the
> 
> Ditto...
> 
> > + * CQ using ib_alloc_cq().  It is up to the consumer to handle concurrency
> > + * issues if the CQs are using the IB_POLL_DIRECT polling context.
> > + */
> > +void ib_drain_rq(struct ib_qp *qp)
> > +{
> > +	if (qp->device->drain_rq)
> > +		qp->device->drain_rq(qp);
> > +	else
> > +		__ib_drain_rq(qp);
> > +}
> > +EXPORT_SYMBOL(ib_drain_rq);
> > +
> > +/**
> > + * ib_drain_qp() - Block until all CQEs have been consumed by the
> > + *		   application on both the RQ and SQ.
> > + * @qp:            queue pair to drain
> > + *
> > + * The consumer must ensure there is room in the CQ(s), SQ, and RQ
> > + * for a drain work requests.  Also the consumer must allocate the
> 
> Ditto...
> 
> > + * CQ using ib_alloc_cq().  It is up to the consumer to handle concurrency
> > + * issues if the CQs are using the IB_POLL_DIRECT polling context.
> > + */
> > +void ib_drain_qp(struct ib_qp *qp)
> > +{
> > +	ib_drain_sq(qp);
> > +	ib_drain_rq(qp);
> > +}
> > +EXPORT_SYMBOL(ib_drain_qp);
> > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> > index 284b00c..68b7e97 100644
> > --- a/include/rdma/ib_verbs.h
> > +++ b/include/rdma/ib_verbs.h
> > @@ -1846,6 +1846,8 @@ struct ib_device {
> >  	int			   (*check_mr_status)(struct ib_mr *mr, u32 check_mask,
> >  						      struct ib_mr_status *mr_status);
> >  	void			   (*disassociate_ucontext)(struct ib_ucontext *ibcontext);
> > +	void			   (*drain_rq)(struct ib_qp *qp);
> > +	void			   (*drain_sq)(struct ib_qp *qp);
> >
> >  	struct ib_dma_mapping_ops   *dma_ops;
> >
> > @@ -3094,4 +3096,7 @@ int ib_sg_to_pages(struct ib_mr *mr,
> >  		   int sg_nents,
> >  		   int (*set_page)(struct ib_mr *, u64));
> >
> > +void ib_drain_rq(struct ib_qp *qp);
> > +void ib_drain_sq(struct ib_qp *qp);
> > +void ib_drain_qp(struct ib_qp *qp);
> >  #endif /* IB_VERBS_H */
> >
> 
> 
> --
> Doug Ledford <dledford@redhat.com>
>               GPG KeyID: 0E572FDD
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sagi Grimberg Feb. 16, 2016, 11 a.m. UTC | #12
> As I mentioned in my comment above, I would have thought that the
> attempt to post a send to a QP in ERR state would have returned an
> error.  It must not or else this patch is worthless because of the order
> of actions.  What that highlights though, is that this code will drain a
> QP, but only if the caller has taken the time to stop all possible
> contexts that might run on other cores and post commands to the QP.
> Those commands will error out, but the caller must, none the less, take
> steps to block other contexts from sending or else this drain is
> useless.  That might be fine for the API, but it should be clearly
> documented, and currently it isn't.

I agree, it should be documented that if there are other contexts that
post concurrently then the QP drain is not guaranteed. That's a valid
requirement I think...
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 5af6d02..aed521e 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1657,3 +1657,147 @@  next_page:
 	return i;
 }
 EXPORT_SYMBOL(ib_sg_to_pages);
+
+struct ib_drain_cqe {
+	struct ib_cqe cqe;
+	struct completion done;
+};
+
+static void ib_drain_qp_done(struct ib_cq *cq, struct ib_wc *wc)
+{
+	struct ib_drain_cqe *cqe = container_of(wc->wr_cqe, struct ib_drain_cqe,
+						cqe);
+
+	complete(&cqe->done);
+}
+
+static void wait_for_drain(struct ib_cq *cq, struct completion *c)
+{
+	if (cq->poll_ctx == IB_POLL_DIRECT)
+		do
+			ib_process_cq_direct(cq, 1024);
+		while (!wait_for_completion_timeout(c, msecs_to_jiffies(100)));
+	else
+		wait_for_completion(c);
+}
+
+/*
+ * Post a WR and block until its completion is reaped for the SQ.
+ */
+static void __ib_drain_sq(struct ib_qp *qp)
+{
+	struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
+	struct ib_drain_cqe sdrain;
+	struct ib_send_wr swr = {}, *bad_swr;
+	int ret;
+
+	swr.wr_cqe = &sdrain.cqe;
+	sdrain.cqe.done = ib_drain_qp_done;
+	init_completion(&sdrain.done);
+
+	ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
+	if (ret) {
+		WARN_ONCE(ret, "failed to drain QP: %d\n", ret);
+		return;
+	}
+
+	ret = ib_post_send(qp, &swr, &bad_swr);
+	if (ret) {
+		WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
+		return;
+	}
+
+	wait_for_drain(qp->send_cq, &sdrain.done);
+}
+
+/*
+ * Post a WR and block until its completion is reaped for the RQ.
+ */
+static void __ib_drain_rq(struct ib_qp *qp)
+{
+	struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };
+	struct ib_drain_cqe rdrain;
+	struct ib_recv_wr rwr = {}, *bad_rwr;
+	int ret;
+
+	rwr.wr_cqe = &rdrain.cqe;
+	rdrain.cqe.done = ib_drain_qp_done;
+	init_completion(&rdrain.done);
+
+	ret = ib_modify_qp(qp, &attr, IB_QP_STATE);
+	if (ret) {
+		WARN_ONCE(ret, "failed to drain QP: %d\n", ret);
+		return;
+	}
+
+	ret = ib_post_recv(qp, &rwr, &bad_rwr);
+	if (ret) {
+		WARN_ONCE(ret, "failed to drain send queue: %d\n", ret);
+		return;
+	}
+
+	wait_for_drain(qp->recv_cq, &rdrain.done);
+}
+
+/**
+ * ib_drain_sq() - Block until all SQ CQEs have been consumed by the
+ *		   application.
+ * @qp:            queue pair to drain
+ *
+ * If the device has a provider-specific drain function, then
+ * call that.  Otherwise call the generic drain function
+ * __ib_drain_sq().
+ *
+ * The consumer must ensure there is room in the CQ and SQ
+ * for a drain work requests.  Also the consumer must allocate the
+ * CQ using ib_alloc_cq().  It is up to the consumer to handle concurrency
+ * issues if the CQs are using the IB_POLL_DIRECT polling context.
+ */
+void ib_drain_sq(struct ib_qp *qp)
+{
+	if (qp->device->drain_sq)
+		qp->device->drain_sq(qp);
+	else
+		__ib_drain_sq(qp);
+}
+EXPORT_SYMBOL(ib_drain_sq);
+
+/**
+ * ib_drain_rq() - Block until all RQ CQEs have been consumed by the
+ *		   application.
+ * @qp:            queue pair to drain
+ *
+ * If the device has a provider-specific drain function, then
+ * call that.  Otherwise call the generic drain function
+ * __ib_drain_rq().
+ *
+ * The consumer must ensure there is room in the CQ and RQ
+ * for a drain work requests.  Also the consumer must allocate the
+ * CQ using ib_alloc_cq().  It is up to the consumer to handle concurrency
+ * issues if the CQs are using the IB_POLL_DIRECT polling context.
+ */
+void ib_drain_rq(struct ib_qp *qp)
+{
+	if (qp->device->drain_rq)
+		qp->device->drain_rq(qp);
+	else
+		__ib_drain_rq(qp);
+}
+EXPORT_SYMBOL(ib_drain_rq);
+
+/**
+ * ib_drain_qp() - Block until all CQEs have been consumed by the
+ *		   application on both the RQ and SQ.
+ * @qp:            queue pair to drain
+ *
+ * The consumer must ensure there is room in the CQ(s), SQ, and RQ
+ * for a drain work requests.  Also the consumer must allocate the
+ * CQ using ib_alloc_cq().  It is up to the consumer to handle concurrency
+ * issues if the CQs are using the IB_POLL_DIRECT polling context.
+ */
+void ib_drain_qp(struct ib_qp *qp)
+{
+	ib_drain_sq(qp);
+	ib_drain_rq(qp);
+}
+EXPORT_SYMBOL(ib_drain_qp);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 284b00c..68b7e97 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1846,6 +1846,8 @@  struct ib_device {
 	int			   (*check_mr_status)(struct ib_mr *mr, u32 check_mask,
 						      struct ib_mr_status *mr_status);
 	void			   (*disassociate_ucontext)(struct ib_ucontext *ibcontext);
+	void			   (*drain_rq)(struct ib_qp *qp);
+	void			   (*drain_sq)(struct ib_qp *qp);
 
 	struct ib_dma_mapping_ops   *dma_ops;
 
@@ -3094,4 +3096,7 @@  int ib_sg_to_pages(struct ib_mr *mr,
 		   int sg_nents,
 		   int (*set_page)(struct ib_mr *, u64));
 
+void ib_drain_rq(struct ib_qp *qp);
+void ib_drain_sq(struct ib_qp *qp);
+void ib_drain_qp(struct ib_qp *qp);
 #endif /* IB_VERBS_H */