From patchwork Wed Sep 8 05:29:24 2021
X-Patchwork-Submitter: Bob Pearson
X-Patchwork-Id: 12480161
From: Bob Pearson
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org, bvanassche@acm.org
Cc: Bob Pearson
Subject: [PATCH for-next v2 1/5] RDMA/rxe: Add memory barriers to kernel queues
Date: Wed, 8 Sep 2021 00:29:24 -0500
Message-Id: <20210908052928.17375-2-rpearsonhpe@gmail.com>
In-Reply-To: <20210908052928.17375-1-rpearsonhpe@gmail.com>
References: <20210908052928.17375-1-rpearsonhpe@gmail.com>

Earlier patches added memory barriers to protect communication between
user space and kernel space. This patch extends the same scheme to
queues shared between kernel threads: the unprotected QUEUE_TYPE_KERNEL
paths are removed and all queues now use the barrier-protected
QUEUE_TYPE_TO_CLIENT and QUEUE_TYPE_FROM_CLIENT types.
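To make the pairing concrete, here is a minimal, self-contained sketch of
the protocol the queue helpers rely on. It is illustrative only, not rxe
code: struct ring, ring_post(), ring_poll() and RING_SIZE are made-up
names, with RING_SIZE standing in for the queue's index_mask + 1. The
barriers are the same smp_load_acquire()/smp_store_release() pair the
patch uses.

#include <linux/errno.h>
#include <linux/types.h>
#include <asm/barrier.h>

#define RING_SIZE 64			/* must be a power of 2 */

struct ring {
	u32 prod;			/* advanced only by the producer */
	u32 cons;			/* advanced only by the consumer */
	int slot[RING_SIZE];
};

static int ring_post(struct ring *r, int val)
{
	u32 prod = r->prod;		/* private to the producer */
	/* pairs with smp_store_release(&r->cons, ...) in ring_poll() */
	u32 cons = smp_load_acquire(&r->cons);

	if (((prod + 1 - cons) & (RING_SIZE - 1)) == 0)
		return -EBUSY;		/* ring full */

	r->slot[prod & (RING_SIZE - 1)] = val;	/* fill the entry ... */
	/* ... then publish it; pairs with the acquire in ring_poll() */
	smp_store_release(&r->prod, prod + 1);
	return 0;
}

static int ring_poll(struct ring *r, int *val)
{
	u32 cons = r->cons;		/* private to the consumer */
	/* pairs with smp_store_release(&r->prod, ...) in ring_post() */
	u32 prod = smp_load_acquire(&r->prod);

	if (prod == cons)
		return -EAGAIN;		/* ring empty */

	*val = r->slot[cons & (RING_SIZE - 1)];	/* read the entry ... */
	/* ... then free the slot for reuse by the producer */
	smp_store_release(&r->cons, cons + 1);
	return 0;
}

The release store keeps the entry write from sinking below the index
update, and the acquire load keeps the entry read from hoisting above the
index check; without both, a kernel-to-kernel queue is just as racy as a
user-to-kernel one, which is the point of this patch.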
Signed-off-by: Bob Pearson
---
v2: Rebase on version 5.14.

 drivers/infiniband/sw/rxe/rxe_comp.c  | 10 +---
 drivers/infiniband/sw/rxe/rxe_cq.c    | 25 ++-------
 drivers/infiniband/sw/rxe/rxe_qp.c    | 10 ++--
 drivers/infiniband/sw/rxe/rxe_queue.h | 73 ++++++++-------------------
 drivers/infiniband/sw/rxe/rxe_req.c   | 21 ++------
 drivers/infiniband/sw/rxe/rxe_resp.c  | 38 ++++----------
 drivers/infiniband/sw/rxe/rxe_srq.c   |  2 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c | 53 ++++---------------
 8 files changed, 55 insertions(+), 177 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index d2d802c776fd..ed4e3f29bd65 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -142,10 +142,7 @@ static inline enum comp_state get_wqe(struct rxe_qp *qp,
 	/* we come here whether or not we found a response packet to see if
 	 * there are any posted WQEs
 	 */
-	if (qp->is_user)
-		wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_USER);
-	else
-		wqe = queue_head(qp->sq.queue, QUEUE_TYPE_KERNEL);
+	wqe = queue_head(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
 	*wqe_p = wqe;
 
 	/* no WQE or requester has not started it yet */
@@ -432,10 +429,7 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 	if (post)
 		make_send_cqe(qp, wqe, &cqe);
 
-	if (qp->is_user)
-		advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_USER);
-	else
-		advance_consumer(qp->sq.queue, QUEUE_TYPE_KERNEL);
+	advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
 
 	if (post)
 		rxe_cq_post(qp->scq, &cqe, 0);
diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
index aef288f164fd..4e26c2ea4a59 100644
--- a/drivers/infiniband/sw/rxe/rxe_cq.c
+++ b/drivers/infiniband/sw/rxe/rxe_cq.c
@@ -25,11 +25,7 @@ int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq,
 	}
 
 	if (cq) {
-		if (cq->is_user)
-			count = queue_count(cq->queue, QUEUE_TYPE_TO_USER);
-		else
-			count = queue_count(cq->queue, QUEUE_TYPE_KERNEL);
-
+		count = queue_count(cq->queue, QUEUE_TYPE_TO_CLIENT);
 		if (cqe < count) {
 			pr_warn("cqe(%d) < current # elements in queue (%d)",
 				cqe, count);
@@ -65,7 +61,7 @@ int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
 	int err;
 	enum queue_type type;
 
-	type = uresp ? QUEUE_TYPE_TO_USER : QUEUE_TYPE_KERNEL;
+	type = QUEUE_TYPE_TO_CLIENT;
 	cq->queue = rxe_queue_init(rxe, &cqe,
 			sizeof(struct rxe_cqe), type);
 	if (!cq->queue) {
@@ -117,11 +113,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 
 	spin_lock_irqsave(&cq->cq_lock, flags);
 
-	if (cq->is_user)
-		full = queue_full(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		full = queue_full(cq->queue, QUEUE_TYPE_KERNEL);
-
+	full = queue_full(cq->queue, QUEUE_TYPE_TO_CLIENT);
 	if (unlikely(full)) {
 		spin_unlock_irqrestore(&cq->cq_lock, flags);
 		if (cq->ibcq.event_handler) {
@@ -134,17 +126,10 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 		return -EBUSY;
 	}
 
-	if (cq->is_user)
-		addr = producer_addr(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		addr = producer_addr(cq->queue, QUEUE_TYPE_KERNEL);
-
+	addr = producer_addr(cq->queue, QUEUE_TYPE_TO_CLIENT);
 	memcpy(addr, cqe, sizeof(*cqe));
 
-	if (cq->is_user)
-		advance_producer(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		advance_producer(cq->queue, QUEUE_TYPE_KERNEL);
+	advance_producer(cq->queue, QUEUE_TYPE_TO_CLIENT);
 
 	spin_unlock_irqrestore(&cq->cq_lock, flags);
diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 1ab6af7ddb25..2e923af642f8 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -231,7 +231,7 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 	qp->sq.max_inline = init->cap.max_inline_data = wqe_size;
 	wqe_size += sizeof(struct rxe_send_wqe);
 
-	type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
+	type = QUEUE_TYPE_FROM_CLIENT;
 	qp->sq.queue = rxe_queue_init(rxe, &qp->sq.max_wr,
 			wqe_size, type);
 	if (!qp->sq.queue)
@@ -248,12 +248,8 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 		return err;
 	}
 
-	if (qp->is_user)
 		qp->req.wqe_index = producer_index(qp->sq.queue,
-					QUEUE_TYPE_FROM_USER);
-	else
-		qp->req.wqe_index = producer_index(qp->sq.queue,
-					QUEUE_TYPE_KERNEL);
+					QUEUE_TYPE_FROM_CLIENT);
 
 	qp->req.state = QP_STATE_RESET;
 	qp->req.opcode = -1;
@@ -293,7 +289,7 @@ static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
 		pr_debug("qp#%d max_wr = %d, max_sge = %d, wqe_size = %d\n",
 			 qp_num(qp), qp->rq.max_wr, qp->rq.max_sge, wqe_size);
 
-		type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
+		type = QUEUE_TYPE_FROM_CLIENT;
 		qp->rq.queue = rxe_queue_init(rxe, &qp->rq.max_wr,
 				wqe_size, type);
 		if (!qp->rq.queue)
diff --git a/drivers/infiniband/sw/rxe/rxe_queue.h b/drivers/infiniband/sw/rxe/rxe_queue.h
index 2702b0e55fc3..d465aa9342e1 100644
--- a/drivers/infiniband/sw/rxe/rxe_queue.h
+++ b/drivers/infiniband/sw/rxe/rxe_queue.h
@@ -35,9 +35,8 @@
 
 /* type of queue */
 enum queue_type {
-	QUEUE_TYPE_KERNEL,
-	QUEUE_TYPE_TO_USER,
-	QUEUE_TYPE_FROM_USER,
+	QUEUE_TYPE_TO_CLIENT,
+	QUEUE_TYPE_FROM_CLIENT,
 };
 
 struct rxe_queue {
@@ -87,20 +86,16 @@ static inline int queue_empty(struct rxe_queue *q, enum queue_type type)
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return ((prod - cons) & q->index_mask) == 0;
@@ -112,20 +107,16 @@ static inline int queue_full(struct rxe_queue *q, enum queue_type type)
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return ((prod + 1 - cons) & q->index_mask) == 0;
@@ -138,20 +129,16 @@ static inline unsigned int queue_count(const struct rxe_queue *q,
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return (prod - cons) & q->index_mask;
@@ -162,7 +149,7 @@ static inline void advance_producer(struct rxe_queue *q, enum queue_type type)
 	u32 prod;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		pr_warn_once("Normally kernel should not write user space index\n");
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
@@ -170,15 +157,11 @@ static inline void advance_producer(struct rxe_queue *q, enum queue_type type)
 		/* same */
 		smp_store_release(&q->buf->producer_index, prod);
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		q->index = (prod + 1) & q->index_mask;
 		q->buf->producer_index = q->index;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		q->buf->producer_index = (prod + 1) & q->index_mask;
-		break;
 	}
 }
 
@@ -187,12 +170,12 @@ static inline void advance_consumer(struct rxe_queue *q, enum queue_type type)
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		cons = q->index;
 		q->index = (cons + 1) & q->index_mask;
 		q->buf->consumer_index = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		pr_warn_once("Normally kernel should not write user space index\n");
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
@@ -200,10 +183,6 @@ static inline void advance_consumer(struct rxe_queue *q, enum queue_type type)
 		/* same */
 		smp_store_release(&q->buf->consumer_index, cons);
 		break;
-	case QUEUE_TYPE_KERNEL:
-		cons = q->buf->consumer_index;
-		q->buf->consumer_index = (cons + 1) & q->index_mask;
-		break;
 	}
 }
 
@@ -212,17 +191,14 @@ static inline void *producer_addr(struct rxe_queue *q, enum queue_type type)
 	u32 prod;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		prod &= q->index_mask;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		break;
 	}
 
 	return q->buf->data + (prod << q->log2_elem_size);
@@ -233,17 +209,14 @@ static inline void *consumer_addr(struct rxe_queue *q, enum queue_type type)
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		cons &= q->index_mask;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return q->buf->data + (cons << q->log2_elem_size);
@@ -255,17 +228,14 @@ static inline unsigned int producer_index(struct rxe_queue *q,
 	u32 prod;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		/* protect user space index */
 		prod = smp_load_acquire(&q->buf->producer_index);
 		prod &= q->index_mask;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		prod = q->index;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		prod = q->buf->producer_index;
-		break;
 	}
 
 	return prod;
@@ -277,17 +247,14 @@ static inline unsigned int consumer_index(struct rxe_queue *q,
 	u32 cons;
 
 	switch (type) {
-	case QUEUE_TYPE_FROM_USER:
+	case QUEUE_TYPE_FROM_CLIENT:
 		cons = q->index;
 		break;
-	case QUEUE_TYPE_TO_USER:
+	case QUEUE_TYPE_TO_CLIENT:
 		/* protect user space index */
 		cons = smp_load_acquire(&q->buf->consumer_index);
 		cons &= q->index_mask;
 		break;
-	case QUEUE_TYPE_KERNEL:
-		cons = q->buf->consumer_index;
-		break;
 	}
 
 	return cons;
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 3894197a82f6..22c3edb28945 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -49,13 +49,8 @@ static void req_retry(struct rxe_qp *qp)
 	unsigned int cons;
 	unsigned int prod;
 
-	if (qp->is_user) {
-		cons = consumer_index(q, QUEUE_TYPE_FROM_USER);
-		prod = producer_index(q, QUEUE_TYPE_FROM_USER);
-	} else {
-		cons = consumer_index(q, QUEUE_TYPE_KERNEL);
-		prod = producer_index(q, QUEUE_TYPE_KERNEL);
-	}
+	cons = consumer_index(q, QUEUE_TYPE_FROM_CLIENT);
+	prod = producer_index(q, QUEUE_TYPE_FROM_CLIENT);
 
 	qp->req.wqe_index = cons;
 	qp->req.psn = qp->comp.psn;
@@ -121,15 +116,9 @@ static struct rxe_send_wqe *req_next_wqe(struct rxe_qp *qp)
 	unsigned int cons;
 	unsigned int prod;
 
-	if (qp->is_user) {
-		wqe = queue_head(q, QUEUE_TYPE_FROM_USER);
-		cons = consumer_index(q, QUEUE_TYPE_FROM_USER);
-		prod = producer_index(q, QUEUE_TYPE_FROM_USER);
-	} else {
-		wqe = queue_head(q, QUEUE_TYPE_KERNEL);
-		cons = consumer_index(q, QUEUE_TYPE_KERNEL);
-		prod = producer_index(q, QUEUE_TYPE_KERNEL);
-	}
+	wqe = queue_head(q, QUEUE_TYPE_FROM_CLIENT);
+	cons = consumer_index(q, QUEUE_TYPE_FROM_CLIENT);
+	prod = producer_index(q, QUEUE_TYPE_FROM_CLIENT);
 
 	if (unlikely(qp->req.state == QP_STATE_DRAIN)) {
 		/* check to see if we are drained;
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 5501227ddc65..596be002d33d 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -303,10 +303,7 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
 
 	spin_lock_bh(&srq->rq.consumer_lock);
 
-	if (qp->is_user)
-		wqe = queue_head(q, QUEUE_TYPE_FROM_USER);
-	else
-		wqe = queue_head(q, QUEUE_TYPE_KERNEL);
+	wqe = queue_head(q, QUEUE_TYPE_FROM_CLIENT);
 	if (!wqe) {
 		spin_unlock_bh(&srq->rq.consumer_lock);
 		return RESPST_ERR_RNR;
@@ -322,13 +319,8 @@ static enum resp_states get_srq_wqe(struct rxe_qp *qp)
 	memcpy(&qp->resp.srq_wqe, wqe, size);
 
 	qp->resp.wqe = &qp->resp.srq_wqe.wqe;
-	if (qp->is_user) {
-		advance_consumer(q, QUEUE_TYPE_FROM_USER);
-		count = queue_count(q, QUEUE_TYPE_FROM_USER);
-	} else {
-		advance_consumer(q, QUEUE_TYPE_KERNEL);
-		count = queue_count(q, QUEUE_TYPE_KERNEL);
-	}
+	advance_consumer(q, QUEUE_TYPE_FROM_CLIENT);
+	count = queue_count(q, QUEUE_TYPE_FROM_CLIENT);
 
 	if (srq->limit && srq->ibsrq.event_handler && (count < srq->limit)) {
 		srq->limit = 0;
@@ -357,12 +349,8 @@ static enum resp_states check_resource(struct rxe_qp *qp,
 			qp->resp.status = IB_WC_WR_FLUSH_ERR;
 			return RESPST_COMPLETE;
 		} else if (!srq) {
-			if (qp->is_user)
-				qp->resp.wqe = queue_head(qp->rq.queue,
-						QUEUE_TYPE_FROM_USER);
-			else
-				qp->resp.wqe = queue_head(qp->rq.queue,
-						QUEUE_TYPE_KERNEL);
+			qp->resp.wqe = queue_head(qp->rq.queue,
+					QUEUE_TYPE_FROM_CLIENT);
 			if (qp->resp.wqe) {
 				qp->resp.status = IB_WC_WR_FLUSH_ERR;
 				return RESPST_COMPLETE;
@@ -389,12 +377,8 @@ static enum resp_states check_resource(struct rxe_qp *qp,
 	if (srq)
 		return get_srq_wqe(qp);
 
-	if (qp->is_user)
-		qp->resp.wqe = queue_head(qp->rq.queue,
-				QUEUE_TYPE_FROM_USER);
-	else
-		qp->resp.wqe = queue_head(qp->rq.queue,
-				QUEUE_TYPE_KERNEL);
+	qp->resp.wqe = queue_head(qp->rq.queue,
+			QUEUE_TYPE_FROM_CLIENT);
 	return (qp->resp.wqe) ? RESPST_CHK_LENGTH : RESPST_ERR_RNR;
 }
 
@@ -936,12 +920,8 @@ static enum resp_states do_complete(struct rxe_qp *qp,
 	}
 
 	/* have copy for srq and reference for !srq */
-	if (!qp->srq) {
-		if (qp->is_user)
-			advance_consumer(qp->rq.queue, QUEUE_TYPE_FROM_USER);
-		else
-			advance_consumer(qp->rq.queue, QUEUE_TYPE_KERNEL);
-	}
+	if (!qp->srq)
+		advance_consumer(qp->rq.queue, QUEUE_TYPE_FROM_CLIENT);
 
 	qp->resp.wqe = NULL;
diff --git a/drivers/infiniband/sw/rxe/rxe_srq.c b/drivers/infiniband/sw/rxe/rxe_srq.c
index 610c98d24b5c..a9e7817e2732 100644
--- a/drivers/infiniband/sw/rxe/rxe_srq.c
+++ b/drivers/infiniband/sw/rxe/rxe_srq.c
@@ -93,7 +93,7 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
 	spin_lock_init(&srq->rq.producer_lock);
 	spin_lock_init(&srq->rq.consumer_lock);
 
-	type = uresp ? QUEUE_TYPE_FROM_USER : QUEUE_TYPE_KERNEL;
+	type = QUEUE_TYPE_FROM_CLIENT;
 	q = rxe_queue_init(rxe, &srq->rq.max_wr,
 			srq_wqe_size, type);
 	if (!q) {
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 267b5a9c345d..dc70e3edeba6 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -218,11 +218,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
 	int num_sge = ibwr->num_sge;
 	int full;
 
-	if (rq->is_user)
-		full = queue_full(rq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		full = queue_full(rq->queue, QUEUE_TYPE_KERNEL);
-
+	full = queue_full(rq->queue, QUEUE_TYPE_FROM_CLIENT);
 	if (unlikely(full)) {
 		err = -ENOMEM;
 		goto err1;
@@ -237,11 +233,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
 	for (i = 0; i < num_sge; i++)
 		length += ibwr->sg_list[i].length;
 
-	if (rq->is_user)
-		recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_KERNEL);
-
+	recv_wqe = producer_addr(rq->queue, QUEUE_TYPE_FROM_CLIENT);
 	recv_wqe->wr_id = ibwr->wr_id;
 	recv_wqe->num_sge = num_sge;
 
@@ -254,10 +246,7 @@ static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
 	recv_wqe->dma.cur_sge = 0;
 	recv_wqe->dma.sge_offset = 0;
 
-	if (rq->is_user)
-		advance_producer(rq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		advance_producer(rq->queue, QUEUE_TYPE_KERNEL);
+	advance_producer(rq->queue, QUEUE_TYPE_FROM_CLIENT);
 
 	return 0;
 
@@ -633,27 +622,17 @@ static int post_one_send(struct rxe_qp *qp, const struct ib_send_wr *ibwr,
 
 	spin_lock_irqsave(&qp->sq.sq_lock, flags);
 
-	if (qp->is_user)
-		full = queue_full(sq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		full = queue_full(sq->queue, QUEUE_TYPE_KERNEL);
+	full = queue_full(sq->queue, QUEUE_TYPE_FROM_CLIENT);
 
 	if (unlikely(full)) {
 		spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
 		return -ENOMEM;
 	}
 
-	if (qp->is_user)
-		send_wqe = producer_addr(sq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		send_wqe = producer_addr(sq->queue, QUEUE_TYPE_KERNEL);
-
+	send_wqe = producer_addr(sq->queue, QUEUE_TYPE_FROM_CLIENT);
 	init_send_wqe(qp, ibwr, mask, length, send_wqe);
 
-	if (qp->is_user)
-		advance_producer(sq->queue, QUEUE_TYPE_FROM_USER);
-	else
-		advance_producer(sq->queue, QUEUE_TYPE_KERNEL);
+	advance_producer(sq->queue, QUEUE_TYPE_FROM_CLIENT);
 
 	spin_unlock_irqrestore(&qp->sq.sq_lock, flags);
 
@@ -845,18 +824,12 @@ static int rxe_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc)
 
 	spin_lock_irqsave(&cq->cq_lock, flags);
 	for (i = 0; i < num_entries; i++) {
-		if (cq->is_user)
-			cqe = queue_head(cq->queue, QUEUE_TYPE_TO_USER);
-		else
-			cqe = queue_head(cq->queue, QUEUE_TYPE_KERNEL);
+		cqe = queue_head(cq->queue, QUEUE_TYPE_TO_CLIENT);
 		if (!cqe)
 			break;
 
 		memcpy(wc++, &cqe->ibwc, sizeof(*wc));
-		if (cq->is_user)
-			advance_consumer(cq->queue, QUEUE_TYPE_TO_USER);
-		else
-			advance_consumer(cq->queue, QUEUE_TYPE_KERNEL);
+		advance_consumer(cq->queue, QUEUE_TYPE_TO_CLIENT);
 	}
 	spin_unlock_irqrestore(&cq->cq_lock, flags);
 
@@ -868,10 +841,7 @@ static int rxe_peek_cq(struct ib_cq *ibcq, int wc_cnt)
 	struct rxe_cq *cq = to_rcq(ibcq);
 	int count;
 
-	if (cq->is_user)
-		count = queue_count(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		count = queue_count(cq->queue, QUEUE_TYPE_KERNEL);
+	count = queue_count(cq->queue, QUEUE_TYPE_TO_CLIENT);
 
 	return (count > wc_cnt) ? wc_cnt : count;
 }
 
@@ -887,10 +857,7 @@ static int rxe_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
 	if (cq->notify != IB_CQ_NEXT_COMP)
 		cq->notify = flags & IB_CQ_SOLICITED_MASK;
 
-	if (cq->is_user)
-		empty = queue_empty(cq->queue, QUEUE_TYPE_TO_USER);
-	else
-		empty = queue_empty(cq->queue, QUEUE_TYPE_KERNEL);
+	empty = queue_empty(cq->queue, QUEUE_TYPE_TO_CLIENT);
 
 	if ((flags & IB_CQ_REPORT_MISSED_EVENTS) && !empty)
 		ret = 1;

From patchwork Wed Sep 8 05:29:25 2021
X-Patchwork-Submitter: Bob Pearson
X-Patchwork-Id: 12480159
From: Bob Pearson
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org, bvanassche@acm.org
Cc: Bob Pearson, Dan Carpenter
Subject: [PATCH for-next v2 2/5] RDMA/rxe: Fix memory allocation while locked
Date: Wed, 8 Sep 2021 00:29:25 -0500
Message-Id: <20210908052928.17375-3-rpearsonhpe@gmail.com>
In-Reply-To: <20210908052928.17375-1-rpearsonhpe@gmail.com>
References: <20210908052928.17375-1-rpearsonhpe@gmail.com>

rxe_mcast_add_grp_elem() in rxe_mcast.c calls rxe_alloc() while holding
spinlocks. rxe_alloc() in turn calls kzalloc(size, GFP_KERNEL), which may
sleep and is therefore illegal in that context. Replace rxe_alloc() with
rxe_alloc_locked(), which uses GFP_ATOMIC. The bug was introduced by the
commit cited below, which dropped the atomic allocation without
accounting for this caller.
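The bug class is easy to illustrate; the following sketch uses
hypothetical types and names (struct grp, struct elem, add_elem()), not
the rxe code itself:

#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/list.h>

struct grp {				/* stand-in for the mcast group */
	spinlock_t lock;
	struct list_head elems;
};

struct elem {				/* stand-in for the pool element */
	struct list_head list;
};

static struct elem *add_elem(struct grp *grp)
{
	struct elem *elem;

	spin_lock_bh(&grp->lock);

	/*
	 * kzalloc(sizeof(*elem), GFP_KERNEL) here could sleep for memory
	 * reclaim and, with CONFIG_DEBUG_ATOMIC_SLEEP, splats with
	 * "BUG: sleeping function called from invalid context".
	 * While the spinlock is held the allocation must be atomic:
	 */
	elem = kzalloc(sizeof(*elem), GFP_ATOMIC);
	if (elem)
		list_add(&elem->list, &grp->elems);

	spin_unlock_bh(&grp->lock);
	return elem;
}

rxe_alloc_locked() encapsulates exactly this choice of GFP_ATOMIC for
callers that already hold a pool lock.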
Fixes: 4276fd0dddc9 ("Remove RXE_POOL_ATOMIC")
Reported-by: Dan Carpenter
Signed-off-by: Bob Pearson
---
v2: rebase on version 5.14

 drivers/infiniband/sw/rxe/rxe_mcast.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mcast.c b/drivers/infiniband/sw/rxe/rxe_mcast.c
index 0ea9a5aa4ec0..1c1d1b53312d 100644
--- a/drivers/infiniband/sw/rxe/rxe_mcast.c
+++ b/drivers/infiniband/sw/rxe/rxe_mcast.c
@@ -85,7 +85,7 @@ int rxe_mcast_add_grp_elem(struct rxe_dev *rxe, struct rxe_qp *qp,
 		goto out;
 	}
 
-	elem = rxe_alloc(&rxe->mc_elem_pool);
+	elem = rxe_alloc_locked(&rxe->mc_elem_pool);
 	if (!elem) {
 		err = -ENOMEM;
 		goto out;

From patchwork Wed Sep 8 05:29:26 2021
X-Patchwork-Submitter: Bob Pearson
X-Patchwork-Id: 12480163

From: Bob Pearson
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org, bvanassche@acm.org
Cc: Bob Pearson
Subject: [PATCH for-next 3/5] RDMA/rxe: Separate HW and SW l/rkeys
Date: Wed, 8 Sep 2021 00:29:26 -0500
Message-Id: <20210908052928.17375-4-rpearsonhpe@gmail.com>
In-Reply-To: <20210908052928.17375-1-rpearsonhpe@gmail.com>
References: <20210908052928.17375-1-rpearsonhpe@gmail.com>

Separate the software keys (ibmr.lkey/rkey and ibmw.rkey) from the
simulated hardware lkeys and rkeys for MRs and MWs. This isolates
struct ib_mr and struct ib_mw from the key changes triggered by
executing work requests.
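A minimal sketch of the resulting layout follows; struct mr_keys and the
helper names are hypothetical, not the driver's types. In rxe an l/rkey
is (pool index << 8) | key byte: the index locates the MR or MW, and the
low byte is the part a REG_MR WQE may vary (this patch additionally
checks that the index bits of a requested key match the MR's own). The
ib-visible keys are fixed at creation; the wire path validates only the
private copies.

#include <linux/types.h>

struct mr_keys {
	u32 ib_lkey, ib_rkey;	/* what the ULP sees (struct ib_mr) */
	u32 lkey, rkey;		/* what the 'hardware' checks on the wire */
};

static void mr_keys_init(struct mr_keys *k, u32 index, u8 key, bool remote)
{
	u32 lkey = (index << 8) | key;

	/* at creation the two views start out identical */
	k->lkey = k->ib_lkey = lkey;
	k->rkey = k->ib_rkey = remote ? lkey : 0;
}

/* a REG_MR WQE may later change the low key byte, but only in the
 * private copies; the index bits stay those of the MR itself
 */
static void mr_keys_fast_reg(struct mr_keys *k, u8 key, bool remote)
{
	k->lkey = (k->lkey & ~0xffu) | key;
	k->rkey = remote ? k->lkey : 0;
}

With this split, incoming packets are checked against mr->lkey/mr->rkey,
and nothing executed on the requester path ever writes into the ib_mr or
ib_mw structures.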
Signed-off-by: Bob Pearson
---
 drivers/infiniband/sw/rxe/rxe_loc.h   |  1 +
 drivers/infiniband/sw/rxe/rxe_mr.c    | 69 ++++++++++++++++++++++-----
 drivers/infiniband/sw/rxe/rxe_mw.c    | 30 ++++++------
 drivers/infiniband/sw/rxe/rxe_req.c   | 14 ++----
 drivers/infiniband/sw/rxe/rxe_verbs.h | 18 ++-----
 5 files changed, 81 insertions(+), 51 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index f0c954575bde..4fd73b51fabf 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -86,6 +86,7 @@ struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
 int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length);
 int advance_dma_data(struct rxe_dma_info *dma, unsigned int length);
 int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey);
+int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe);
 int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata);
 void rxe_mr_cleanup(struct rxe_pool_entry *arg);
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 5890a8246216..bedcf15aaea7 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -48,8 +48,14 @@ static void rxe_mr_init(int access, struct rxe_mr *mr)
 	u32 lkey = mr->pelem.index << 8 | rxe_get_next_key(-1);
 	u32 rkey = (access & IB_ACCESS_REMOTE) ? lkey : 0;
 
-	mr->ibmr.lkey = lkey;
-	mr->ibmr.rkey = rkey;
+	/* set ibmr->l/rkey and also copy into private l/rkey
+	 * for user MRs these will always be the same
+	 * for cases where caller 'owns' the key portion
+	 * they may be different until REG_MR WQE is executed.
+	 */
+	mr->lkey = mr->ibmr.lkey = lkey;
+	mr->rkey = mr->ibmr.rkey = rkey;
+
 	mr->state = RXE_MR_STATE_INVALID;
 	mr->type = RXE_MR_TYPE_NONE;
 	mr->map_shift = ilog2(RXE_BUF_PER_MAP);
@@ -191,10 +197,8 @@ int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr *mr)
 {
 	int err;
 
-	rxe_mr_init(0, mr);
-
-	/* In fastreg, we also set the rkey */
-	mr->ibmr.rkey = mr->ibmr.lkey;
+	/* always allow remote access for FMRs */
+	rxe_mr_init(IB_ACCESS_REMOTE, mr);
 
 	err = rxe_mr_alloc(mr, max_pages);
 	if (err)
@@ -507,8 +511,8 @@ struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key,
 	if (!mr)
 		return NULL;
 
-	if (unlikely((type == RXE_LOOKUP_LOCAL && mr_lkey(mr) != key) ||
-		     (type == RXE_LOOKUP_REMOTE && mr_rkey(mr) != key) ||
+	if (unlikely((type == RXE_LOOKUP_LOCAL && mr->lkey != key) ||
+		     (type == RXE_LOOKUP_REMOTE && mr->rkey != key) ||
 		     mr_pd(mr) != pd || (access && !(access & mr->access)) ||
 		     mr->state != RXE_MR_STATE_VALID)) {
 		rxe_drop_ref(mr);
@@ -531,9 +535,9 @@ int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey)
 		goto err;
 	}
 
-	if (rkey != mr->ibmr.rkey) {
-		pr_err("%s: rkey (%#x) doesn't match mr->ibmr.rkey (%#x)\n",
-			__func__, rkey, mr->ibmr.rkey);
+	if (rkey != mr->rkey) {
+		pr_err("%s: rkey (%#x) doesn't match mr->rkey (%#x)\n",
+			__func__, rkey, mr->rkey);
 		ret = -EINVAL;
 		goto err_drop_ref;
 	}
@@ -554,6 +558,49 @@ int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey)
 	return ret;
 }
 
+/* user can (re)register fast MR by executing a REG_MR WQE.
+ * user is expected to hold a reference on the ib mr until the
+ * WQE completes.
+ * Once a fast MR is created this is the only way to change the
+ * private keys. It is the responsibility of the user to maintain
+ * the ib mr keys in sync with rxe mr keys.
+ */
+int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
+{
+	struct rxe_mr *mr = to_rmr(wqe->wr.wr.reg.mr);
+	u32 key = wqe->wr.wr.reg.key;
+	u32 access = wqe->wr.wr.reg.access;
+
+	/* user can only register MR in free state */
+	if (unlikely(mr->state != RXE_MR_STATE_FREE)) {
+		pr_warn("%s: mr->lkey = 0x%x not free\n",
+			__func__, mr->lkey);
+		return -EINVAL;
+	}
+
+	/* user can only register mr with qp in same protection domain */
+	if (unlikely(qp->ibqp.pd != mr->ibmr.pd)) {
+		pr_warn("%s: qp->pd and mr->pd don't match\n",
+			__func__);
+		return -EINVAL;
+	}
+
+	/* user is only allowed to change key portion of l/rkey */
+	if (unlikely((mr->lkey & ~0xff) != (key & ~0xff))) {
+		pr_warn("%s: key = 0x%x has wrong index mr->lkey = 0x%x\n",
+			__func__, key, mr->lkey);
+		return -EINVAL;
+	}
+
+	mr->access = access;
+	mr->lkey = key;
+	mr->rkey = (access & IB_ACCESS_REMOTE) ? key : 0;
+	mr->iova = wqe->wr.wr.reg.mr->iova;
+	mr->state = RXE_MR_STATE_VALID;
+
+	return 0;
+}
+
 int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 {
 	struct rxe_mr *mr = to_rmr(ibmr);
diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c
index 5ba77df7598e..a5e2ea7d80f0 100644
--- a/drivers/infiniband/sw/rxe/rxe_mw.c
+++ b/drivers/infiniband/sw/rxe/rxe_mw.c
@@ -21,7 +21,7 @@ int rxe_alloc_mw(struct ib_mw *ibmw, struct ib_udata *udata)
 	}
 
 	rxe_add_index(mw);
-	ibmw->rkey = (mw->pelem.index << 8) | rxe_get_next_key(-1);
+	mw->rkey = ibmw->rkey = (mw->pelem.index << 8) | rxe_get_next_key(-1);
 	mw->state = (mw->ibmw.type == IB_MW_TYPE_2) ?
 			RXE_MW_STATE_FREE : RXE_MW_STATE_VALID;
 	spin_lock_init(&mw->lock);
@@ -71,6 +71,8 @@ int rxe_dealloc_mw(struct ib_mw *ibmw)
 static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 			 struct rxe_mw *mw, struct rxe_mr *mr)
 {
+	u32 key = wqe->wr.wr.mw.rkey & 0xff;
+
 	if (mw->ibmw.type == IB_MW_TYPE_1) {
 		if (unlikely(mw->state != RXE_MW_STATE_VALID)) {
 			pr_err_once(
@@ -108,7 +110,7 @@ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 		}
 	}
 
-	if (unlikely((wqe->wr.wr.mw.rkey & 0xff) == (mw->ibmw.rkey & 0xff))) {
+	if (unlikely(key == (mw->rkey & 0xff))) {
 		pr_err_once("attempt to bind MW with same key\n");
 		return -EINVAL;
 	}
@@ -161,13 +163,9 @@ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 static void rxe_do_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 		      struct rxe_mw *mw, struct rxe_mr *mr)
 {
-	u32 rkey;
-	u32 new_rkey;
-
-	rkey = mw->ibmw.rkey;
-	new_rkey = (rkey & 0xffffff00) | (wqe->wr.wr.mw.rkey & 0x000000ff);
-
-	mw->ibmw.rkey = new_rkey;
+	u32 key = wqe->wr.wr.mw.rkey & 0xff;
+
+	mw->rkey = (mw->rkey & ~0xff) | key;
 	mw->access = wqe->wr.wr.mw.access;
 	mw->state = RXE_MW_STATE_VALID;
 	mw->addr = wqe->wr.wr.mw.addr;
@@ -197,29 +195,29 @@ int rxe_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 	struct rxe_mw *mw;
 	struct rxe_mr *mr;
 	struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+	u32 mw_rkey = wqe->wr.wr.mw.mw_rkey;
+	u32 mr_lkey = wqe->wr.wr.mw.mr_lkey;
 	unsigned long flags;
 
-	mw = rxe_pool_get_index(&rxe->mw_pool,
-				wqe->wr.wr.mw.mw_rkey >> 8);
+	mw = rxe_pool_get_index(&rxe->mw_pool, mw_rkey >> 8);
 	if (unlikely(!mw)) {
 		ret = -EINVAL;
 		goto err;
 	}
 
-	if (unlikely(mw->ibmw.rkey != wqe->wr.wr.mw.mw_rkey)) {
+	if (unlikely(mw->rkey != mw_rkey)) {
 		ret = -EINVAL;
 		goto err_drop_mw;
 	}
 
 	if (likely(wqe->wr.wr.mw.length)) {
-		mr = rxe_pool_get_index(&rxe->mr_pool,
-					wqe->wr.wr.mw.mr_lkey >> 8);
+		mr = rxe_pool_get_index(&rxe->mr_pool, mr_lkey >> 8);
 		if (unlikely(!mr)) {
 			ret = -EINVAL;
 			goto err_drop_mw;
 		}
 
-		if (unlikely(mr->ibmr.lkey != wqe->wr.wr.mw.mr_lkey)) {
+		if (unlikely(mr->lkey != mr_lkey)) {
 			ret = -EINVAL;
 			goto err_drop_mr;
 		}
@@ -292,7 +290,7 @@ int rxe_invalidate_mw(struct rxe_qp *qp, u32 rkey)
 		goto err;
 	}
 
-	if (rkey != mw->ibmw.rkey) {
+	if (rkey != mw->rkey) {
 		ret = -EINVAL;
 		goto err_drop_ref;
 	}
@@ -323,7 +321,7 @@ struct rxe_mw *rxe_lookup_mw(struct rxe_qp *qp, int access, u32 rkey)
 	if (!mw)
 		return NULL;
 
-	if (unlikely((rxe_mw_rkey(mw) != rkey) || rxe_mw_pd(mw) != pd ||
+	if (unlikely((mw->rkey != rkey) || rxe_mw_pd(mw) != pd ||
 		     (mw->ibmw.type == IB_MW_TYPE_2 && mw->qp != qp) ||
 		     (mw->length == 0) ||
 		     (access && !(access & mw->access)) ||
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 22c3edb28945..ac18dcd6905b 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -561,7 +561,6 @@ static void update_state(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 static int rxe_do_local_ops(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 {
 	u8 opcode = wqe->wr.opcode;
-	struct rxe_mr *mr;
 	u32 rkey;
 	int ret;
 
@@ -579,14 +578,11 @@ static int rxe_do_local_ops(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 		}
 		break;
 	case IB_WR_REG_MR:
-		mr = to_rmr(wqe->wr.wr.reg.mr);
-		rxe_add_ref(mr);
-		mr->state = RXE_MR_STATE_VALID;
-		mr->access = wqe->wr.wr.reg.access;
-		mr->ibmr.lkey = wqe->wr.wr.reg.key;
-		mr->ibmr.rkey = wqe->wr.wr.reg.key;
-		mr->iova = wqe->wr.wr.reg.mr->iova;
-		rxe_drop_ref(mr);
+		ret = rxe_reg_fast_mr(qp, wqe);
+		if (unlikely(ret)) {
+			wqe->status = IB_WC_LOC_QP_OP_ERR;
+			return ret;
+		}
 		break;
 	case IB_WR_BIND_MW:
 		ret = rxe_bind_mw(qp, wqe);
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index ac2a2148027f..d90b1d77de34 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -313,6 +313,8 @@ struct rxe_mr {
 	struct ib_umem *umem;
 
+	u32 lkey;
+	u32 rkey;
 	enum rxe_mr_state state;
 	enum rxe_mr_type type;
 	u64 va;
@@ -350,6 +352,7 @@ struct rxe_mw {
 	enum rxe_mw_state state;
 	struct rxe_qp *qp; /* Type 2 only */
 	struct rxe_mr *mr;
+	u32 rkey;
 	int access;
 	u64 addr;
 	u64 length;
@@ -474,26 +477,11 @@ static inline struct rxe_pd *mr_pd(struct rxe_mr *mr)
 	return to_rpd(mr->ibmr.pd);
 }
 
-static inline u32 mr_lkey(struct rxe_mr *mr)
-{
-	return mr->ibmr.lkey;
-}
-
-static inline u32 mr_rkey(struct rxe_mr *mr)
-{
-	return mr->ibmr.rkey;
-}
-
 static inline struct rxe_pd *rxe_mw_pd(struct rxe_mw *mw)
 {
 	return to_rpd(mw->ibmw.pd);
 }
 
-static inline u32 rxe_mw_rkey(struct rxe_mw *mw)
-{
-	return mw->ibmw.rkey;
-}
-
 int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name);
 
 void rxe_mc_cleanup(struct rxe_pool_entry *arg);

From patchwork Wed Sep 8 05:29:27 2021
X-Patchwork-Submitter: Bob Pearson
X-Patchwork-Id: 12480167
From: Bob Pearson
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org, bvanassche@acm.org
Cc: Bob Pearson
Subject: [PATCH for-next 4/5] RDMA/rxe: Create duplicate mapping tables for FMRs
Date: Wed, 8 Sep 2021 00:29:27 -0500
Message-Id: <20210908052928.17375-5-rpearsonhpe@gmail.com>
In-Reply-To: <20210908052928.17375-1-rpearsonhpe@gmail.com>
References: <20210908052928.17375-1-rpearsonhpe@gmail.com>

For fast memory regions, create duplicate mapping tables so that
ib_map_mr_sg() can build a new mapping table, which is then swapped into
place synchronously with the execution of an IB_WR_REG_MR work request.
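A sketch of the double-buffering scheme (hypothetical names; the real
structures are rxe_map_set and rxe_mr): the packet path always reads cur,
ib_map_mr_sg() builds into next, and the REG_MR WQE commits by swapping
the two pointers, so a half-built table is never visible to in-flight
operations.

#include <linux/types.h>

struct map_set {
	u64 iova;		/* plus the page mapping array itself */
};

struct fmr {
	struct map_set *cur;	/* read by the packet/copy path */
	struct map_set *next;	/* filled by ib_map_mr_sg() */
};

/* runs synchronously with the IB_WR_REG_MR work request */
static void fmr_commit(struct fmr *mr, u64 iova)
{
	struct map_set *tmp = mr->cur;

	mr->cur = mr->next;	/* publish the newly built table */
	mr->cur->iova = iova;
	mr->next = tmp;		/* old table is recycled for the next reg */
}

Keeping two tables costs memory but avoids locking the data path against
ib_map_mr_sg(), which may run while earlier operations on the same MR are
still draining.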
Signed-off-by: Bob Pearson
---
 drivers/infiniband/sw/rxe/rxe_loc.h   |   1 +
 drivers/infiniband/sw/rxe/rxe_mr.c    | 196 +++++++++++++++++---------
 drivers/infiniband/sw/rxe/rxe_mw.c    |   6 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c |  39 ++---
 drivers/infiniband/sw/rxe/rxe_verbs.h |  21 +--
 5 files changed, 162 insertions(+), 101 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index 4fd73b51fabf..1ca43b859d80 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -87,6 +87,7 @@ int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length);
 int advance_dma_data(struct rxe_dma_info *dma, unsigned int length);
 int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey);
 int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe);
+int rxe_mr_set_page(struct ib_mr *ibmr, u64 addr);
 int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata);
 void rxe_mr_cleanup(struct rxe_pool_entry *arg);
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index bedcf15aaea7..c909e220e782 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -24,13 +24,15 @@ u8 rxe_get_next_key(u32 last_key)
 
 int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length)
 {
+	struct rxe_map_set *set = mr->cur_map_set;
+
 	switch (mr->type) {
 	case RXE_MR_TYPE_DMA:
 		return 0;
 
 	case RXE_MR_TYPE_MR:
-		if (iova < mr->iova || length > mr->length ||
-		    iova > mr->iova + mr->length - length)
+		if (iova < set->iova || length > set->length ||
+		    iova > set->iova + set->length - length)
 			return -EFAULT;
 		return 0;
 
@@ -61,41 +63,89 @@ static void rxe_mr_init(int access, struct rxe_mr *mr)
 	mr->map_shift = ilog2(RXE_BUF_PER_MAP);
 }
 
-static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf)
+static void rxe_mr_free_map_set(int num_map, struct rxe_map_set *set)
 {
 	int i;
-	int num_map;
-	struct rxe_map **map = mr->map;
 
-	num_map = (num_buf + RXE_BUF_PER_MAP - 1) / RXE_BUF_PER_MAP;
+	for (i = 0; i < num_map; i++)
+		kfree(set->map[i]);
 
-	mr->map = kmalloc_array(num_map, sizeof(*map), GFP_KERNEL);
-	if (!mr->map)
-		goto err1;
+	kfree(set->map);
+	kfree(set);
+}
+
+static int rxe_mr_alloc_map_set(int num_map, struct rxe_map_set **setp)
+{
+	int i;
+	struct rxe_map_set *set;
+
+	set = kmalloc(sizeof(*set), GFP_KERNEL);
+	if (!set)
+		goto err_out;
+
+	set->map = kmalloc_array(num_map, sizeof(struct rxe_map *), GFP_KERNEL);
+	if (!set->map)
+		goto err_free_set;
 
 	for (i = 0; i < num_map; i++) {
-		mr->map[i] = kmalloc(sizeof(**map), GFP_KERNEL);
-		if (!mr->map[i])
-			goto err2;
+		set->map[i] = kmalloc(sizeof(struct rxe_map), GFP_KERNEL);
+		if (!set->map[i])
+			goto err_free_map;
 	}
 
+	*setp = set;
+
+	return 0;
+
+err_free_map:
+	for (i--; i >= 0; i--)
+		kfree(set->map[i]);
+
+	kfree(set->map);
+err_free_set:
+	kfree(set);
+err_out:
+	return -ENOMEM;
+}
+
+/**
+ * rxe_mr_alloc() - Allocate memory map array(s) for MR
+ * @mr: Memory region
+ * @num_buf: Number of buffer descriptors to support
+ * @both: If non zero allocate both mr->map and mr->next_map
+ *	  else just allocate mr->map. Used for fast MRs
+ *
+ * Return: 0 on success else an error
+ */
+static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf, int both)
+{
+	int ret;
+	int num_map;
+
+	BUILD_BUG_ON(!is_power_of_2(RXE_BUF_PER_MAP));
 	num_map = (num_buf + RXE_BUF_PER_MAP - 1) / RXE_BUF_PER_MAP;
 
 	mr->map_shift = ilog2(RXE_BUF_PER_MAP);
 	mr->map_mask = RXE_BUF_PER_MAP - 1;
-	mr->num_buf = num_buf;
-	mr->num_map = num_map;
 	mr->max_buf = num_map * RXE_BUF_PER_MAP;
+	mr->num_map = num_map;
 
-	return 0;
+	ret = rxe_mr_alloc_map_set(num_map, &mr->cur_map_set);
+	if (ret)
+		goto err_out;
 
-err2:
-	for (i--; i >= 0; i--)
-		kfree(mr->map[i]);
+	if (both) {
+		ret = rxe_mr_alloc_map_set(num_map, &mr->next_map_set);
+		if (ret) {
+			rxe_mr_free_map_set(mr->num_map, mr->cur_map_set);
+			goto err_out;
+		}
+	}
 
-	kfree(mr->map);
-err1:
+	return 0;
+
+err_out:
 	return -ENOMEM;
 }
 
@@ -112,6 +162,7 @@ void rxe_mr_init_dma(struct rxe_pd *pd, int access, struct rxe_mr *mr)
 int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 		     int access, struct rxe_mr *mr)
 {
+	struct rxe_map_set *set;
 	struct rxe_map **map;
 	struct rxe_phys_buf *buf = NULL;
 	struct ib_umem *umem;
@@ -119,7 +170,6 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 	int num_buf;
 	void *vaddr;
 	int err;
-	int i;
 
 	umem = ib_umem_get(pd->ibpd.device, start, length, access);
 	if (IS_ERR(umem)) {
@@ -133,18 +183,20 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 
 	rxe_mr_init(access, mr);
 
-	err = rxe_mr_alloc(mr, num_buf);
+	err = rxe_mr_alloc(mr, num_buf, 0);
 	if (err) {
 		pr_warn("%s: Unable to allocate memory for map\n", __func__);
 		goto err_release_umem;
 	}
 
-	mr->page_shift = PAGE_SHIFT;
-	mr->page_mask = PAGE_SIZE - 1;
+	set = mr->cur_map_set;
+	set->page_shift = PAGE_SHIFT;
+	set->page_mask = PAGE_SIZE - 1;
+
+	num_buf = 0;
+	map = set->map;
 
-	num_buf = 0;
-	map = mr->map;
 	if (length > 0) {
 		buf = map[0]->buf;
 
@@ -167,26 +219,24 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 			buf->size = PAGE_SIZE;
 			num_buf++;
 			buf++;
-
 		}
 	}
 
 	mr->ibmr.pd = &pd->ibpd;
 	mr->umem = umem;
 	mr->access = access;
-	mr->length = length;
-	mr->iova = iova;
-	mr->va = start;
-	mr->offset = ib_umem_offset(umem);
 	mr->state = RXE_MR_STATE_VALID;
 	mr->type = RXE_MR_TYPE_MR;
 
+	set->length = length;
+	set->iova = iova;
+	set->va = start;
+	set->offset = ib_umem_offset(umem);
+
 	return 0;
 
 err_cleanup_map:
-	for (i = 0; i < mr->num_map; i++)
-		kfree(mr->map[i]);
-	kfree(mr->map);
+	rxe_mr_free_map_set(mr->num_map, mr->cur_map_set);
 err_release_umem:
 	ib_umem_release(umem);
 err_out:
@@ -200,7 +250,7 @@ int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr *mr)
 	/* always allow remote access for FMRs */
 	rxe_mr_init(IB_ACCESS_REMOTE, mr);
 
-	err = rxe_mr_alloc(mr, max_pages);
+	err = rxe_mr_alloc(mr, max_pages, 1);
 	if (err)
 		goto err1;
 
@@ -218,21 +268,24 @@ int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr *mr)
 static void lookup_iova(struct rxe_mr *mr, u64 iova, int *m_out, int *n_out,
 			size_t *offset_out)
 {
-	size_t offset = iova - mr->iova + mr->offset;
+	struct rxe_map_set *set = mr->cur_map_set;
+	size_t offset = iova - set->iova + set->offset;
 	int map_index;
 	int buf_index;
 	u64 length;
+	struct rxe_map *map;
 
-	if (likely(mr->page_shift)) {
-		*offset_out = offset & mr->page_mask;
-		offset >>= mr->page_shift;
+	if (likely(set->page_shift)) {
+		*offset_out = offset & set->page_mask;
+		offset >>= set->page_shift;
 		*n_out = offset & mr->map_mask;
 		*m_out = offset >> mr->map_shift;
 	} else {
 		map_index = 0;
 		buf_index = 0;
 
-		length = mr->map[map_index]->buf[buf_index].size;
+		map = set->map[map_index];
+		length = map->buf[buf_index].size;
 
 		while (offset >= length) {
 			offset -= length;
@@ -242,7 +295,8 @@ static void lookup_iova(struct rxe_mr *mr, u64 iova, int *m_out, int *n_out,
 				map_index++;
 				buf_index = 0;
 			}
-			length = mr->map[map_index]->buf[buf_index].size;
+			map = set->map[map_index];
+			length = map->buf[buf_index].size;
 		}
 
 		*m_out = map_index;
@@ -263,7 +317,7 @@ void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
 		goto out;
 	}
 
-	if (!mr->map) {
+	if (!mr->cur_map_set) {
 		addr = (void *)(uintptr_t)iova;
 		goto out;
 	}
@@ -276,13 +330,13 @@ void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length)
 
 	lookup_iova(mr, iova, &m, &n, &offset);
 
-	if (offset + length > mr->map[m]->buf[n].size) {
+	if (offset + length > mr->cur_map_set->map[m]->buf[n].size) {
 		pr_warn("crosses page boundary\n");
 		addr = NULL;
 		goto out;
 	}
 
-	addr = (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset;
+	addr = (void *)(uintptr_t)mr->cur_map_set->map[m]->buf[n].addr + offset;
 
 out:
 	return addr;
@@ -318,7 +372,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
 		return 0;
 	}
 
-	WARN_ON_ONCE(!mr->map);
+	WARN_ON_ONCE(!mr->cur_map_set);
 
 	err = mr_check_range(mr, iova, length);
 	if (err) {
@@ -328,7 +382,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
 
 	lookup_iova(mr, iova, &m, &i, &offset);
 
-	map = mr->map + m;
+	map = mr->cur_map_set->map + m;
 	buf = map[0]->buf + i;
 
 	while (length > 0) {
@@ -568,8 +622,9 @@ int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey)
 int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 {
 	struct rxe_mr *mr = to_rmr(wqe->wr.wr.reg.mr);
-	u32 key = wqe->wr.wr.reg.key;
+	u32 key = wqe->wr.wr.reg.key & 0xff;
 	u32 access = wqe->wr.wr.reg.access;
+	struct rxe_map_set *set;
 
 	/* user can only register MR in free state */
 	if (unlikely(mr->state != RXE_MR_STATE_FREE)) {
@@ -585,19 +640,36 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 		return -EINVAL;
 	}
 
-	/* user is only allowed to change key portion of l/rkey */
-	if (unlikely((mr->lkey & ~0xff) != (key & ~0xff))) {
-		pr_warn("%s: key = 0x%x has wrong index mr->lkey = 0x%x\n",
-			__func__, key, mr->lkey);
-		return -EINVAL;
-	}
-
 	mr->access = access;
-	mr->lkey = key;
-	mr->rkey = (access & IB_ACCESS_REMOTE) ? key : 0;
-	mr->iova = wqe->wr.wr.reg.mr->iova;
+	mr->lkey = (mr->lkey & ~0xff) | key;
+	mr->rkey = (access & IB_ACCESS_REMOTE) ? mr->lkey : 0;
 	mr->state = RXE_MR_STATE_VALID;
 
+	set = mr->cur_map_set;
+	mr->cur_map_set = mr->next_map_set;
+	mr->cur_map_set->iova = wqe->wr.wr.reg.mr->iova;
+	mr->next_map_set = set;
+
+	return 0;
+}
+
+int rxe_mr_set_page(struct ib_mr *ibmr, u64 addr)
+{
+	struct rxe_mr *mr = to_rmr(ibmr);
+	struct rxe_map_set *set = mr->next_map_set;
+	struct rxe_map *map;
+	struct rxe_phys_buf *buf;
+
+	if (unlikely(set->nbuf == mr->num_buf))
+		return -ENOMEM;
+
+	map = set->map[set->nbuf / RXE_BUF_PER_MAP];
+	buf = &map->buf[set->nbuf % RXE_BUF_PER_MAP];
+
+	buf->addr = addr;
+	buf->size = ibmr->page_size;
+	set->nbuf++;
+
 	return 0;
 }
 
@@ -622,14 +694,12 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 void rxe_mr_cleanup(struct rxe_pool_entry *arg)
 {
 	struct rxe_mr *mr = container_of(arg, typeof(*mr), pelem);
-	int i;
 
 	ib_umem_release(mr->umem);
 
-	if (mr->map) {
-		for (i = 0; i < mr->num_map; i++)
-			kfree(mr->map[i]);
+	if (mr->cur_map_set)
+		rxe_mr_free_map_set(mr->num_map, mr->cur_map_set);
 
-		kfree(mr->map);
-	}
+	if (mr->next_map_set)
+		rxe_mr_free_map_set(mr->num_map, mr->next_map_set);
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c
index a5e2ea7d80f0..9534a7fe1a98 100644
--- a/drivers/infiniband/sw/rxe/rxe_mw.c
+++ b/drivers/infiniband/sw/rxe/rxe_mw.c
@@ -142,15 +142,15 @@ static int rxe_check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 
 	/* C10-75 */
 	if (mw->access & IB_ZERO_BASED) {
-		if (unlikely(wqe->wr.wr.mw.length > mr->length)) {
+		if (unlikely(wqe->wr.wr.mw.length > mr->cur_map_set->length)) {
 			pr_err_once(
 				"attempt to bind a ZB MW outside of the MR\n");
 			return -EINVAL;
 		}
 	} else {
-		if (unlikely((wqe->wr.wr.mw.addr < mr->iova) ||
+		if (unlikely((wqe->wr.wr.mw.addr < mr->cur_map_set->iova) ||
 			     ((wqe->wr.wr.mw.addr + wqe->wr.wr.mw.length) >
-			      (mr->iova + mr->length)))) {
+			      (mr->cur_map_set->iova + mr->cur_map_set->length)))) {
 			pr_err_once(
 				"attempt to bind a VA MW outside of the MR\n");
 			return -EINVAL;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index dc70e3edeba6..e7f482184359 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -954,41 +954,26 @@ static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type,
 	return ERR_PTR(err);
 }
 
-static int rxe_set_page(struct ib_mr *ibmr, u64 addr)
-{
-	struct rxe_mr *mr = to_rmr(ibmr);
-	struct rxe_map *map;
-	struct rxe_phys_buf *buf;
-
-	if (unlikely(mr->nbuf == mr->num_buf))
-		return -ENOMEM;
-
-	map = mr->map[mr->nbuf / RXE_BUF_PER_MAP];
-	buf = &map->buf[mr->nbuf % RXE_BUF_PER_MAP];
-
-	buf->addr = addr;
-	buf->size = ibmr->page_size;
-	mr->nbuf++;
-
-	return 0;
-}
-
+/* build next_map_set from scatterlist
+ * The IB_WR_REG_MR WR will swap map_sets
+ */
 static int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
 			 int sg_nents, unsigned int *sg_offset)
 {
 	struct rxe_mr *mr = to_rmr(ibmr);
+	struct rxe_map_set *set = mr->next_map_set;
 	int n;
 
-	mr->nbuf = 0;
+	set->nbuf = 0;
 
-	n = ib_sg_to_pages(ibmr, sg, sg_nents, sg_offset, rxe_set_page);
+	n = ib_sg_to_pages(ibmr, sg, sg_nents, sg_offset, rxe_mr_set_page);
 
-	mr->va = ibmr->iova;
-	mr->iova = ibmr->iova;
-	mr->length = ibmr->length;
-	mr->page_shift = ilog2(ibmr->page_size);
-	mr->page_mask = ibmr->page_size - 1;
-	mr->offset = mr->iova & mr->page_mask;
+	set->va = ibmr->iova;
+	set->iova = ibmr->iova;
+	set->length = ibmr->length;
+	set->page_shift = ilog2(ibmr->page_size);
+	set->page_mask = ibmr->page_size - 1;
+	set->offset = set->iova & set->page_mask;
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index d90b1d77de34..87c9e8ed55ad 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -300,6 +300,17 @@ struct rxe_map {
 	struct rxe_phys_buf	buf[RXE_BUF_PER_MAP];
 };

+struct rxe_map_set {
+	struct rxe_map		**map;
+	u64			va;
+	u64			iova;
+	size_t			length;
+	u32			offset;
+	u32			nbuf;
+	int			page_shift;
+	int			page_mask;
+};
+
 static inline int rkey_is_mw(u32 rkey)
 {
 	u32 index = rkey >> 8;
@@ -317,26 +328,20 @@ struct rxe_mr {
 	u32			rkey;
 	enum rxe_mr_state	state;
 	enum rxe_mr_type	type;
-	u64			va;
-	u64			iova;
-	size_t			length;
-	u32			offset;
 	int			access;

-	int			page_shift;
-	int			page_mask;
 	int			map_shift;
 	int			map_mask;

 	u32			num_buf;
-	u32			nbuf;

 	u32			max_buf;
 	u32			num_map;

 	atomic_t		num_mw;

-	struct rxe_map		**map;
+	struct rxe_map_set	*cur_map_set;
+	struct rxe_map_set	*next_map_set;
 };

 enum rxe_mw_state {
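struct rxe_map_set above gathers the per-mapping fields that used to
live directly in struct rxe_mr, which is what makes the cur/next swap
possible. rxe_mr_free_map_set(), called from rxe_mr_cleanup() earlier,
is introduced elsewhere in this series and is not shown here; given the
two-level table (num_map blocks of RXE_BUF_PER_MAP buffers each), it
presumably mirrors the kfree() loop it replaces. The sketch below is
illustrative only, not the series' code:

#include <linux/slab.h>

static void sketch_free_map_set(int num_map, struct rxe_map_set *set)
{
	int i;

	/* free each rxe_map block, then the pointer array, then the set */
	for (i = 0; i < num_map; i++)
		kfree(set->map[i]);

	kfree(set->map);
	kfree(set);
}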
From patchwork Wed Sep 8 05:29:28 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bob Pearson
X-Patchwork-Id: 12480165
From: Bob Pearson
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org,
    bvanassche@acm.org
Cc: Bob Pearson
Subject: [PATCH for-next 5/5] RDMA/rxe: Cleanup MR status and type enums
Date: Wed, 8 Sep 2021 00:29:28 -0500
Message-Id: <20210908052928.17375-6-rpearsonhpe@gmail.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20210908052928.17375-1-rpearsonhpe@gmail.com>
References: <20210908052928.17375-1-rpearsonhpe@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-rdma@vger.kernel.org

Eliminate RXE_MR_STATE_ZOMBIE, which is not compatible with the IBA;
RXE_MR_STATE_INVALID is better. Replace RXE_MR_TYPE_XXX with
IB_MR_TYPE_XXX, which covers all the needed types.

Signed-off-by: Bob Pearson
---
 drivers/infiniband/sw/rxe/rxe_mr.c    | 30 ++++++++++++++++++---------
 drivers/infiniband/sw/rxe/rxe_verbs.h |  9 +-------
 2 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index c909e220e782..7c75f66357bc 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -27,16 +27,19 @@ int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length)
 	struct rxe_map_set *set = mr->cur_map_set;

 	switch (mr->type) {
-	case RXE_MR_TYPE_DMA:
+	case IB_MR_TYPE_DMA:
 		return 0;

-	case RXE_MR_TYPE_MR:
+	case IB_MR_TYPE_USER:
+	case IB_MR_TYPE_MEM_REG:
 		if (iova < set->iova || length > set->length ||
 		    iova > set->iova + set->length - length)
 			return -EFAULT;
 		return 0;

 	default:
+		pr_warn("%s: mr type (%d) not supported\n",
+			__func__, mr->type);
 		return -EFAULT;
 	}
 }
@@ -59,7 +62,7 @@ static void rxe_mr_init(int access, struct rxe_mr *mr)
 	mr->rkey = mr->ibmr.rkey = rkey;

 	mr->state = RXE_MR_STATE_INVALID;
-	mr->type = RXE_MR_TYPE_NONE;
+	mr->type = -1;
 	mr->map_shift = ilog2(RXE_BUF_PER_MAP);
 }

@@ -156,7 +159,7 @@ void rxe_mr_init_dma(struct rxe_pd *pd, int access, struct rxe_mr *mr)
 	mr->ibmr.pd = &pd->ibpd;
 	mr->access = access;
 	mr->state = RXE_MR_STATE_VALID;
-	mr->type = RXE_MR_TYPE_DMA;
+	mr->type = IB_MR_TYPE_DMA;
 }

 int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
@@ -226,7 +229,7 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova,
 	mr->umem = umem;
 	mr->access = access;
 	mr->state = RXE_MR_STATE_VALID;
-	mr->type = RXE_MR_TYPE_MR;
+	mr->type = IB_MR_TYPE_USER;

 	set->length = length;
 	set->iova = iova;
@@ -257,7 +260,7 @@ int rxe_mr_init_fast(struct rxe_pd *pd, int max_pages, struct rxe_mr *mr)
 	mr->ibmr.pd = &pd->ibpd;
 	mr->max_buf = max_pages;
 	mr->state = RXE_MR_STATE_FREE;
-	mr->type = RXE_MR_TYPE_MR;
+	mr->type = IB_MR_TYPE_MEM_REG;

 	return 0;
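Taken together, the hunks above and below map the removed rxe-private
MR types onto the core enum ib_mr_type from include/rdma/ib_verbs.h.
The helper below only summarizes that mapping; it is illustrative,
nothing in the driver defines or calls it, and the OLD_* names are
made up to avoid clashing with the enum this patch deletes:

#include <rdma/ib_verbs.h>

enum old_rxe_mr_type {
	OLD_RXE_MR_TYPE_NONE,
	OLD_RXE_MR_TYPE_DMA,
	OLD_RXE_MR_TYPE_MR,
};

/* illustrative only: how this patch reassigns each old type */
static enum ib_mr_type map_old_mr_type(enum old_rxe_mr_type type,
				       bool user_mr)
{
	switch (type) {
	case OLD_RXE_MR_TYPE_DMA:
		return IB_MR_TYPE_DMA;		/* rxe_mr_init_dma() */
	case OLD_RXE_MR_TYPE_MR:
		/* one old type split by how the MR was created */
		return user_mr ? IB_MR_TYPE_USER	/* rxe_mr_init_user() */
			       : IB_MR_TYPE_MEM_REG;	/* rxe_mr_init_fast() */
	default:
		return (enum ib_mr_type)-1;	/* rxe_mr_init(): no type yet */
	}
}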
@@ -360,7 +363,7 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length,
 	if (length == 0)
 		return 0;

-	if (mr->type == RXE_MR_TYPE_DMA) {
+	if (mr->type == IB_MR_TYPE_DMA) {
 		u8 *src, *dest;

 		src = (dir == RXE_TO_MR_OBJ) ? addr : ((void *)(uintptr_t)iova);
@@ -628,8 +631,15 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe)

 	/* user can only register MR in free state */
 	if (unlikely(mr->state != RXE_MR_STATE_FREE)) {
-		pr_warn("%s: mr->lkey = 0x%x not free\n",
-			__func__, mr->lkey);
+		pr_warn("%s: mr->state = %d not free\n",
+			__func__, mr->state);
+		return -EINVAL;
+	}
+
+	/* user can only register MR of type IB_MR_TYPE_MEM_REG */
+	if (unlikely(mr->type != IB_MR_TYPE_MEM_REG)) {
+		pr_warn("%s: mr->type = %d wrong type\n",
+			__func__, mr->type);
 		return -EINVAL;
 	}

@@ -683,7 +693,7 @@ int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 		return -EINVAL;
 	}

-	mr->state = RXE_MR_STATE_ZOMBIE;
+	mr->state = RXE_MR_STATE_INVALID;
 	rxe_drop_ref(mr_pd(mr));
 	rxe_drop_index(mr);
 	rxe_drop_ref(mr);
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index 87c9e8ed55ad..9eabc8f30359 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -267,18 +267,11 @@ struct rxe_qp {
 };

 enum rxe_mr_state {
-	RXE_MR_STATE_ZOMBIE,
 	RXE_MR_STATE_INVALID,
 	RXE_MR_STATE_FREE,
 	RXE_MR_STATE_VALID,
 };

-enum rxe_mr_type {
-	RXE_MR_TYPE_NONE,
-	RXE_MR_TYPE_DMA,
-	RXE_MR_TYPE_MR,
-};
-
 enum rxe_mr_copy_dir {
 	RXE_TO_MR_OBJ,
 	RXE_FROM_MR_OBJ,
@@ -327,7 +320,7 @@ struct rxe_mr {
 	u32			lkey;
 	u32			rkey;
 	enum rxe_mr_state	state;
-	enum rxe_mr_type	type;
+	enum ib_mr_type		type;
 	int			access;

 	int			map_shift;