From patchwork Sat Jan 16 15:55:58 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 8049251
From: Leon Romanovsky
To: yishaih@mellanox.com
Cc: linux-rdma@vger.kernel.org, Leon Romanovsky
Subject: [PATCH libmlx5 V1 2/2] Add cross-channel work request opcodes
Date: Sat, 16 Jan 2016 17:55:58 +0200
Message-Id: <1452959758-29611-3-git-send-email-leon@leon.nu>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1452959758-29611-1-git-send-email-leon@leon.nu>
References: <1452959758-29611-1-git-send-email-leon@leon.nu>
X-Mailing-List: linux-rdma@vger.kernel.org
From: Leon Romanovsky

The cross-channel feature relies on special primitives to send and
receive work requests:

* WAIT on CQ WR - Holds execution of subsequent work requests posted
  on that queue until the requested number of completions on the given
  CQ has been reached.

* SEND_EN WR - Specifies a value of the producer index on the
  controlled send queue. It enables execution of all WQEs up to the
  work request marked by the IBV_SEND_WAIT_EN_LAST flag.

* RECEIVE_EN WR - Same as SEND_EN, but for a receive queue.

Signed-off-by: Leon Romanovsky
Reviewed-by: Sagi Grimberg
---
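Usage note (not part of the commit): a minimal sketch of chaining a wait
and an enable work request on a controlling QP. It assumes the
cross-channel verbs extensions from the companion libibverbs series
(IBV_WR_CQE_WAIT, IBV_WR_SEND_ENABLE, IBV_SEND_WAIT_EN_LAST and the
wr.cqe_wait / wr.wqe_enable arms of struct ibv_send_wr); peer_cq,
managed_qp and mqp are placeholder objects created elsewhere.

	/* mqp is the controlling QP, assumed created with
	 * IBV_QP_CREATE_CROSS_CHANNEL; managed_qp is assumed created
	 * with IBV_QP_CREATE_MANAGED_SEND. */
	struct ibv_send_wr wait_wr = {
		.wr_id      = 1,
		.opcode     = IBV_WR_CQE_WAIT,
		/* Commit the accumulated wait count on this WR */
		.send_flags = IBV_SEND_WAIT_EN_LAST,
	};
	struct ibv_send_wr enable_wr = {
		.wr_id      = 2,
		.opcode     = IBV_WR_SEND_ENABLE,
		.send_flags = IBV_SEND_WAIT_EN_LAST,
	};
	struct ibv_send_wr *bad_wr;
	int err;

	/* Hold the controlling QP until one completion lands on peer_cq */
	wait_wr.wr.cqe_wait.cq       = peer_cq;
	wait_wr.wr.cqe_wait.cq_count = 1;

	/* Then release every WQE currently posted on the managed QP
	 * (wqe_count == 0 releases all, as this patch implements) */
	enable_wr.wr.wqe_enable.qp        = managed_qp;
	enable_wr.wr.wqe_enable.wqe_count = 0;

	wait_wr.next = &enable_wr;
	err = ibv_post_send(mqp, &wait_wr, &bad_wr);
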
 src/mlx5.h  |   9 ++++++
 src/qp.c    | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 src/verbs.c |  14 +++++++
 src/wqe.h   |   5 +++
 4 files changed, 122 insertions(+), 6 deletions(-)

diff --git a/src/mlx5.h b/src/mlx5.h
index 38f5f518a94b..a8e1ad6dda74 100644
--- a/src/mlx5.h
+++ b/src/mlx5.h
@@ -208,6 +208,10 @@ enum {
 	MLX5_OPCODE_LOCAL_INVAL	= 0x1b,
 	MLX5_OPCODE_CONFIG_CMD	= 0x1f,
 
+	MLX5_OPCODE_SEND_ENABLE	= 0x17,
+	MLX5_OPCODE_RECV_ENABLE	= 0x16,
+	MLX5_OPCODE_CQE_WAIT	= 0x0f,
+
 	MLX5_RECV_OPCODE_RDMA_WRITE_IMM	= 0x00,
 	MLX5_RECV_OPCODE_SEND		= 0x01,
 	MLX5_RECV_OPCODE_SEND_IMM	= 0x02,
@@ -368,6 +372,8 @@ struct mlx5_cq {
 	uint64_t			stall_last_count;
 	int				stall_adaptive_enable;
 	int				stall_cycles;
+	uint32_t			wait_index;
+	uint32_t			wait_count;
 };
 
 struct mlx5_srq {
@@ -405,6 +411,8 @@ struct mlx5_wq {
 	int				wqe_shift;
 	int				offset;
 	void				*qend;
+	uint32_t			head_en_index;
+	uint32_t			head_en_count;
 };
 
 struct mlx5_bf {
@@ -437,6 +445,7 @@ struct mlx5_qp {
 	uint32_t			*db;
 	struct mlx5_wq			rq;
 	int				wq_sig;
+	uint32_t			create_flags;
 };
 
 struct mlx5_av {
diff --git a/src/qp.c b/src/qp.c
index 67ded0d197d3..f84684e69d86 100644
--- a/src/qp.c
+++ b/src/qp.c
@@ -54,8 +54,20 @@ static const uint32_t mlx5_ib_opcode[] = {
 	[IBV_WR_RDMA_READ]		= MLX5_OPCODE_RDMA_READ,
 	[IBV_WR_ATOMIC_CMP_AND_SWP]	= MLX5_OPCODE_ATOMIC_CS,
 	[IBV_WR_ATOMIC_FETCH_AND_ADD]	= MLX5_OPCODE_ATOMIC_FA,
+	[IBV_WR_SEND_ENABLE]		= MLX5_OPCODE_SEND_ENABLE,
+	[IBV_WR_RECV_ENABLE]		= MLX5_OPCODE_RECV_ENABLE,
+	[IBV_WR_CQE_WAIT]		= MLX5_OPCODE_CQE_WAIT
 };
 
+static inline void set_wait_en_seg(void *wqe_seg, uint32_t obj_num, uint32_t count)
+{
+	struct mlx5_wqe_wait_en_seg *seg = (struct mlx5_wqe_wait_en_seg *)wqe_seg;
+
+	seg->pi = htonl(count);
+	seg->obj_num = htonl(obj_num);
+	return;
+}
+
 static void *get_recv_wqe(struct mlx5_qp *qp, int n)
 {
 	return qp->buf.buf + qp->rq.offset + (n << qp->rq.wqe_shift);
@@ -155,6 +167,10 @@ void mlx5_init_qp_indices(struct mlx5_qp *qp)
 	qp->rq.head	 = 0;
 	qp->rq.tail	 = 0;
 	qp->sq.cur_post  = 0;
+	qp->sq.head_en_index = 0;
+	qp->sq.head_en_count = 0;
+	qp->rq.head_en_index = 0;
+	qp->rq.head_en_count = 0;
 }
 
 static int mlx5_wq_overflow(struct mlx5_wq *wq, int nreq, struct mlx5_cq *cq)
@@ -336,6 +352,11 @@ int mlx5_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 	void *qend = qp->sq.qend;
 	uint32_t mlx5_opcode;
 	struct mlx5_wqe_xrc_seg *xrc;
+	struct mlx5_cq *wait_cq;
+	uint32_t wait_index = 0;
+	unsigned head_en_index;
+	struct mlx5_wq *wq;
+
 #ifdef MLX5_DEBUG
 	FILE *fp = to_mctx(ibqp->context)->dbg_fp;
 #endif
@@ -352,11 +373,10 @@ int mlx5_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 			goto out;
 		}
 
-		if (unlikely(mlx5_wq_overflow(&qp->sq, nreq,
+		if (unlikely(!(qp->create_flags & IBV_QP_CREATE_IGNORE_SQ_OVERFLOW) && mlx5_wq_overflow(&qp->sq, nreq,
 					      to_mcq(qp->ibv_qp->send_cq)))) {
 			mlx5_dbg(fp, MLX5_DBG_QP_SEND, "work queue overflow\n");
-			errno = ENOMEM;
-			err = -1;
+			err = ENOMEM;
 			*bad_wr = wr;
 			goto out;
 		}
@@ -409,7 +429,69 @@ int mlx5_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
 			err = ENOSYS;
 			*bad_wr = wr;
 			goto out;
+		case IBV_WR_CQE_WAIT:
+			if (!(qp->create_flags & IBV_QP_CREATE_CROSS_CHANNEL)) {
+				err = EINVAL;
+				*bad_wr = wr;
+				goto out;
+			}
+
+			wait_cq = to_mcq(wr->wr.cqe_wait.cq);
+			wait_index = wait_cq->wait_index + wr->wr.cqe_wait.cq_count;
+			wait_cq->wait_count = max(wait_cq->wait_count, wr->wr.cqe_wait.cq_count);
+
+			if (wr->send_flags & IBV_SEND_WAIT_EN_LAST) {
+				wait_cq->wait_index += wait_cq->wait_count;
+				wait_cq->wait_count = 0;
+			}
+
+			set_wait_en_seg(seg, wait_cq->cqn, wait_index);
+			seg += sizeof(struct mlx5_wqe_wait_en_seg);
+			size += sizeof(struct mlx5_wqe_wait_en_seg) / 16;
+			break;
+		case IBV_WR_SEND_ENABLE:
+		case IBV_WR_RECV_ENABLE:
+			if (((wr->opcode == IBV_WR_SEND_ENABLE) &&
+			     !(to_mqp(wr->wr.wqe_enable.qp)->create_flags &
+			       IBV_QP_CREATE_MANAGED_SEND)) ||
+			    ((wr->opcode == IBV_WR_RECV_ENABLE) &&
+			     !(to_mqp(wr->wr.wqe_enable.qp)->create_flags &
+			       IBV_QP_CREATE_MANAGED_RECV))) {
+				err = EINVAL;
+				*bad_wr = wr;
+				goto out;
+			}
+
+			wq = (wr->opcode == IBV_WR_SEND_ENABLE) ?
+			     &to_mqp(wr->wr.wqe_enable.qp)->sq :
+			     &to_mqp(wr->wr.wqe_enable.qp)->rq;
+
+			/* If wqe_count is 0 release all WRs from queue */
+			if (wr->wr.wqe_enable.wqe_count) {
+				head_en_index = wq->head_en_index +
+						wr->wr.wqe_enable.wqe_count;
+				wq->head_en_count = max(wq->head_en_count,
+							wr->wr.wqe_enable.wqe_count);
+
+				if ((int)(wq->head - head_en_index) < 0) {
+					err = EINVAL;
+					*bad_wr = wr;
+					goto out;
+				}
+			} else {
+				head_en_index = wq->head;
+				wq->head_en_count = wq->head - wq->head_en_index;
+			}
+
+			if (wr->send_flags & IBV_SEND_WAIT_EN_LAST) {
+				wq->head_en_index += wq->head_en_count;
+				wq->head_en_count = 0;
+			}
+
+			set_wait_en_seg(seg, wr->wr.wqe_enable.qp->qp_num, head_en_index);
+
+			seg += sizeof(struct mlx5_wqe_wait_en_seg);
+			size += sizeof(struct mlx5_wqe_wait_en_seg) / 16;
+			break;
 		default:
 			break;
 		}
@@ -492,6 +574,11 @@ out:
 	if (likely(nreq)) {
 		qp->sq.head += nreq;
 
+		if (qp->create_flags & IBV_QP_CREATE_MANAGED_SEND) {
+			wmb();
+			goto post_send_no_db;
+		}
+
 		/*
 		 * Make sure that descriptors are written before
 		 * updating doorbell record and ringing the doorbell
@@ -528,6 +615,7 @@ out:
 			mlx5_spin_unlock(&bf->lock);
 	}
 
+post_send_no_db:
 	mlx5_spin_unlock(&qp->sq.lock);
 
 	return err;
@@ -561,11 +649,11 @@ int mlx5_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr,
 	ind = qp->rq.head & (qp->rq.wqe_cnt - 1);
 
 	for (nreq = 0; wr; ++nreq, wr = wr->next) {
-		if (unlikely(mlx5_wq_overflow(&qp->rq, nreq,
+		if (unlikely(!(qp->create_flags & IBV_QP_CREATE_IGNORE_RQ_OVERFLOW) &&
+			     mlx5_wq_overflow(&qp->rq, nreq,
 					      to_mcq(qp->ibv_qp->recv_cq)))) {
-			errno = ENOMEM;
+			err = ENOMEM;
 			*bad_wr = wr;
-			err = -1;
 			goto out;
 		}
diff --git a/src/verbs.c b/src/verbs.c
index 064a500b0a06..15e34488883f 100644
--- a/src/verbs.c
+++ b/src/verbs.c
@@ -309,6 +309,9 @@ static struct ibv_cq *create_cq(struct ibv_context *context,
 	}
 
 	cq->cons_index = 0;
+	/* Cross-channel wait index should start from value below 0 */
+	cq->wait_index = (uint32_t)(-1);
+	cq->wait_count = 0;
 
 	if (mlx5_spinlock_init(&cq->lock))
 		goto err;
@@ -975,6 +978,17 @@ static int init_attr_v2(struct ibv_context *context, struct mlx5_qp *qp,
 	struct mlx5_create_qp_resp_ex resp;
 	int err;
 
+	qp->create_flags = (attr->create_flags & (IBV_QP_CREATE_IGNORE_SQ_OVERFLOW |
+						  IBV_QP_CREATE_IGNORE_RQ_OVERFLOW |
+						  IBV_QP_CREATE_CROSS_CHANNEL |
+						  IBV_QP_CREATE_MANAGED_SEND |
+						  IBV_QP_CREATE_MANAGED_RECV));
+	/*
+	 * These QP flags are virtual and don't need to
+	 * be forwarded to the bottom layer.
+	 */
+	attr->create_flags &= ~(IBV_QP_CREATE_IGNORE_SQ_OVERFLOW | IBV_QP_CREATE_IGNORE_RQ_OVERFLOW);
+
 	memset(&cmd, 0, sizeof(cmd));
 	memset(&resp, 0, sizeof(resp));
 
 	if (qp->wq_sig)
diff --git a/src/wqe.h b/src/wqe.h
index bd50d9a116e1..73aeb6aedfd9 100644
--- a/src/wqe.h
+++ b/src/wqe.h
@@ -187,5 +187,10 @@ struct mlx5_wqe_inline_seg {
 	uint32_t	byte_count;
 };
 
+struct mlx5_wqe_wait_en_seg {
+	uint8_t		rsvd0[8];
+	uint32_t	pi;
+	uint32_t	obj_num;
+};
+
 #endif /* WQE_H */
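
For completeness, a sketch (not from this patch) of how a managed QP
might be created so that init_attr_v2() above latches the new flags.
The create_flags values come from the companion libibverbs series; the
ibv_qp_init_attr_ex / ibv_create_qp_ex plumbing and the
IBV_QP_INIT_ATTR_CREATE_FLAGS comp-mask bit are assumed from the
extended-verbs convention and may differ in the final API; ctx, pd and
cq are placeholder objects.

	struct ibv_qp_init_attr_ex attr;
	struct ibv_qp *managed_qp;

	memset(&attr, 0, sizeof(attr));
	attr.send_cq = cq;
	attr.recv_cq = cq;
	attr.qp_type = IBV_QPT_RC;
	attr.cap.max_send_wr  = 64;
	attr.cap.max_recv_wr  = 64;
	attr.cap.max_send_sge = 1;
	attr.cap.max_recv_sge = 1;
	attr.pd = pd;
	attr.comp_mask = IBV_QP_INIT_ATTR_PD | IBV_QP_INIT_ATTR_CREATE_FLAGS;
	/* Managed send queue: posted WQEs stay pending until a SEND_EN
	 * work request on a controlling QP releases them; SQ overflow
	 * checks are skipped in mlx5_post_send(). */
	attr.create_flags = IBV_QP_CREATE_CROSS_CHANNEL |
			    IBV_QP_CREATE_MANAGED_SEND |
			    IBV_QP_CREATE_IGNORE_SQ_OVERFLOW;

	managed_qp = ibv_create_qp_ex(ctx, &attr);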