From patchwork Thu Sep 3 22:40:35 2020
X-Patchwork-Submitter: Bob Pearson
X-Patchwork-Id: 11755455
From: Bob Pearson
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson
Subject: [PATCH v4 for-next 2/7] rdma_rxe: Separated MEM into MR and MW objects.
Date: Thu, 3 Sep 2020 17:40:35 -0500 Message-Id: <20200903224039.437391-3-rpearson@hpe.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200903224039.437391-1-rpearson@hpe.com> References: <20200903224039.437391-1-rpearson@hpe.com> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org In the original rxe implementation it was intended to use a common object to represent MRs and MWs but it became clear that they are different enough to separate these into two objects. This allows replacing the mem name with mr for MRs which is more consistent with the style for the other objects and less likely to be confusing. This is a long patch that mostly changes mem to mr where it makes sense and adds a new rxe_mw struct. Signed-off-by: Bob Pearson --- drivers/infiniband/sw/rxe/rxe_comp.c | 4 +- drivers/infiniband/sw/rxe/rxe_loc.h | 26 +-- drivers/infiniband/sw/rxe/rxe_mr.c | 264 +++++++++++++------------- drivers/infiniband/sw/rxe/rxe_pool.c | 8 +- drivers/infiniband/sw/rxe/rxe_req.c | 6 +- drivers/infiniband/sw/rxe/rxe_resp.c | 30 +-- drivers/infiniband/sw/rxe/rxe_verbs.c | 18 +- drivers/infiniband/sw/rxe/rxe_verbs.h | 51 ++--- 8 files changed, 204 insertions(+), 203 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c index 0a1e6393250b..5dc86c9e74c2 100644 --- a/drivers/infiniband/sw/rxe/rxe_comp.c +++ b/drivers/infiniband/sw/rxe/rxe_comp.c @@ -345,7 +345,7 @@ static inline enum comp_state do_read(struct rxe_qp *qp, ret = copy_data(qp->pd, IB_ACCESS_LOCAL_WRITE, &wqe->dma, payload_addr(pkt), - payload_size(pkt), to_mem_obj, NULL); + payload_size(pkt), to_mr_obj, NULL); if (ret) return COMPST_ERROR; @@ -365,7 +365,7 @@ static inline enum comp_state do_atomic(struct rxe_qp *qp, ret = copy_data(qp->pd, IB_ACCESS_LOCAL_WRITE, &wqe->dma, &atomic_orig, - sizeof(u64), to_mem_obj, NULL); + sizeof(u64), to_mr_obj, NULL); if (ret) return COMPST_ERROR; else diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h index 0d758760b9ae..9ec6bff6863f 100644 --- a/drivers/infiniband/sw/rxe/rxe_loc.h +++ b/drivers/infiniband/sw/rxe/rxe_loc.h @@ -72,40 +72,40 @@ int rxe_mmap(struct ib_ucontext *context, struct vm_area_struct *vma); /* rxe_mr.c */ enum copy_direction { - to_mem_obj, - from_mem_obj, + to_mr_obj, + from_mr_obj, }; -void rxe_mem_init_dma(struct rxe_pd *pd, - int access, struct rxe_mem *mem); +void rxe_mr_init_dma(struct rxe_pd *pd, + int access, struct rxe_mr *mr); -int rxe_mem_init_user(struct rxe_pd *pd, u64 start, +int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova, int access, struct ib_udata *udata, - struct rxe_mem *mr); + struct rxe_mr *mr); -int rxe_mem_init_fast(struct rxe_pd *pd, - int max_pages, struct rxe_mem *mem); +int rxe_mr_init_fast(struct rxe_pd *pd, + int max_pages, struct rxe_mr *mr); -int rxe_mem_copy(struct rxe_mem *mem, u64 iova, void *addr, +int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length, enum copy_direction dir, u32 *crcp); int copy_data(struct rxe_pd *pd, int access, struct rxe_dma_info *dma, void *addr, int length, enum copy_direction dir, u32 *crcp); -void *iova_to_vaddr(struct rxe_mem *mem, u64 iova, int length); +void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length); enum lookup_type { lookup_local, lookup_remote, }; -struct rxe_mem *lookup_mem(struct rxe_pd *pd, int access, u32 key, +struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key, enum lookup_type type); 
-int mem_check_range(struct rxe_mem *mem, u64 iova, size_t length); +int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length); -void rxe_mem_cleanup(struct rxe_pool_entry *arg); +void rxe_mr_cleanup(struct rxe_pool_entry *arg); int advance_dma_data(struct rxe_dma_info *dma, unsigned int length); diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c index 708e2dff5eaa..368012904879 100644 --- a/drivers/infiniband/sw/rxe/rxe_mr.c +++ b/drivers/infiniband/sw/rxe/rxe_mr.c @@ -24,17 +24,17 @@ static u8 rxe_get_key(void) return key; } -int mem_check_range(struct rxe_mem *mem, u64 iova, size_t length) +int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length) { - switch (mem->type) { - case RXE_MEM_TYPE_DMA: + switch (mr->type) { + case RXE_MR_TYPE_DMA: return 0; - case RXE_MEM_TYPE_MR: - case RXE_MEM_TYPE_FMR: - if (iova < mem->iova || - length > mem->length || - iova > mem->iova + mem->length - length) + case RXE_MR_TYPE_MR: + case RXE_MR_TYPE_FMR: + if (iova < mr->iova || + length > mr->length || + iova > mr->iova + mr->length - length) return -EFAULT; return 0; @@ -47,90 +47,90 @@ int mem_check_range(struct rxe_mem *mem, u64 iova, size_t length) | IB_ACCESS_REMOTE_WRITE \ | IB_ACCESS_REMOTE_ATOMIC) -static void rxe_mem_init(int access, struct rxe_mem *mem) +static void rxe_mr_init(int access, struct rxe_mr *mr) { - u32 lkey = mem->pelem.index << 8 | rxe_get_key(); + u32 lkey = mr->pelem.index << 8 | rxe_get_key(); u32 rkey = (access & IB_ACCESS_REMOTE) ? lkey : 0; - if (mem->pelem.pool->type == RXE_TYPE_MR) { - mem->ibmr.lkey = lkey; - mem->ibmr.rkey = rkey; + if (mr->pelem.pool->type == RXE_TYPE_MR) { + mr->ibmr.lkey = lkey; + mr->ibmr.rkey = rkey; } - mem->lkey = lkey; - mem->rkey = rkey; - mem->state = RXE_MEM_STATE_INVALID; - mem->type = RXE_MEM_TYPE_NONE; - mem->map_shift = ilog2(RXE_BUF_PER_MAP); + mr->lkey = lkey; + mr->rkey = rkey; + mr->state = RXE_MEM_STATE_INVALID; + mr->type = RXE_MR_TYPE_NONE; + mr->map_shift = ilog2(RXE_BUF_PER_MAP); } -void rxe_mem_cleanup(struct rxe_pool_entry *arg) +void rxe_mr_cleanup(struct rxe_pool_entry *arg) { - struct rxe_mem *mem = container_of(arg, typeof(*mem), pelem); + struct rxe_mr *mr = container_of(arg, typeof(*mr), pelem); int i; - ib_umem_release(mem->umem); + ib_umem_release(mr->umem); - if (mem->map) { - for (i = 0; i < mem->num_map; i++) - kfree(mem->map[i]); + if (mr->map) { + for (i = 0; i < mr->num_map; i++) + kfree(mr->map[i]); - kfree(mem->map); + kfree(mr->map); } } -static int rxe_mem_alloc(struct rxe_mem *mem, int num_buf) +static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf) { int i; int num_map; - struct rxe_map **map = mem->map; + struct rxe_map **map = mr->map; num_map = (num_buf + RXE_BUF_PER_MAP - 1) / RXE_BUF_PER_MAP; - mem->map = kmalloc_array(num_map, sizeof(*map), GFP_KERNEL); - if (!mem->map) + mr->map = kmalloc_array(num_map, sizeof(*map), GFP_KERNEL); + if (!mr->map) goto err1; for (i = 0; i < num_map; i++) { - mem->map[i] = kmalloc(sizeof(**map), GFP_KERNEL); - if (!mem->map[i]) + mr->map[i] = kmalloc(sizeof(**map), GFP_KERNEL); + if (!mr->map[i]) goto err2; } BUILD_BUG_ON(!is_power_of_2(RXE_BUF_PER_MAP)); - mem->map_shift = ilog2(RXE_BUF_PER_MAP); - mem->map_mask = RXE_BUF_PER_MAP - 1; + mr->map_shift = ilog2(RXE_BUF_PER_MAP); + mr->map_mask = RXE_BUF_PER_MAP - 1; - mem->num_buf = num_buf; - mem->num_map = num_map; - mem->max_buf = num_map * RXE_BUF_PER_MAP; + mr->num_buf = num_buf; + mr->num_map = num_map; + mr->max_buf = num_map * RXE_BUF_PER_MAP; return 0; err2: for 
(i--; i >= 0; i--) - kfree(mem->map[i]); + kfree(mr->map[i]); - kfree(mem->map); + kfree(mr->map); err1: return -ENOMEM; } -void rxe_mem_init_dma(struct rxe_pd *pd, - int access, struct rxe_mem *mem) +void rxe_mr_init_dma(struct rxe_pd *pd, + int access, struct rxe_mr *mr) { - rxe_mem_init(access, mem); + rxe_mr_init(access, mr); - mem->pd = pd; - mem->access = access; - mem->state = RXE_MEM_STATE_VALID; - mem->type = RXE_MEM_TYPE_DMA; + mr->pd = pd; + mr->access = access; + mr->state = RXE_MEM_STATE_VALID; + mr->type = RXE_MR_TYPE_DMA; } -int rxe_mem_init_user(struct rxe_pd *pd, u64 start, +int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova, int access, struct ib_udata *udata, - struct rxe_mem *mem) + struct rxe_mr *mr) { struct rxe_map **map; struct rxe_phys_buf *buf = NULL; @@ -148,23 +148,23 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start, goto err1; } - mem->umem = umem; + mr->umem = umem; num_buf = ib_umem_num_pages(umem); - rxe_mem_init(access, mem); + rxe_mr_init(access, mr); - err = rxe_mem_alloc(mem, num_buf); + err = rxe_mr_alloc(mr, num_buf); if (err) { - pr_warn("err %d from rxe_mem_alloc\n", err); + pr_warn("err %d from rxe_mr_alloc\n", err); ib_umem_release(umem); goto err1; } - mem->page_shift = PAGE_SHIFT; - mem->page_mask = PAGE_SIZE - 1; + mr->page_shift = PAGE_SHIFT; + mr->page_mask = PAGE_SIZE - 1; num_buf = 0; - map = mem->map; + map = mr->map; if (length > 0) { buf = map[0]->buf; @@ -190,15 +190,15 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start, } } - mem->pd = pd; - mem->umem = umem; - mem->access = access; - mem->length = length; - mem->iova = iova; - mem->va = start; - mem->offset = ib_umem_offset(umem); - mem->state = RXE_MEM_STATE_VALID; - mem->type = RXE_MEM_TYPE_MR; + mr->pd = pd; + mr->umem = umem; + mr->access = access; + mr->length = length; + mr->iova = iova; + mr->va = start; + mr->offset = ib_umem_offset(umem); + mr->state = RXE_MEM_STATE_VALID; + mr->type = RXE_MR_TYPE_MR; return 0; @@ -206,24 +206,24 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start, return err; } -int rxe_mem_init_fast(struct rxe_pd *pd, - int max_pages, struct rxe_mem *mem) +int rxe_mr_init_fast(struct rxe_pd *pd, + int max_pages, struct rxe_mr *mr) { int err; - rxe_mem_init(0, mem); + rxe_mr_init(0, mr); /* In fastreg, we also set the rkey */ - mem->ibmr.rkey = mem->ibmr.lkey; + mr->ibmr.rkey = mr->ibmr.lkey; - err = rxe_mem_alloc(mem, max_pages); + err = rxe_mr_alloc(mr, max_pages); if (err) goto err1; - mem->pd = pd; - mem->max_buf = max_pages; - mem->state = RXE_MEM_STATE_FREE; - mem->type = RXE_MEM_TYPE_MR; + mr->pd = pd; + mr->max_buf = max_pages; + mr->state = RXE_MEM_STATE_FREE; + mr->type = RXE_MR_TYPE_MR; return 0; @@ -232,27 +232,27 @@ int rxe_mem_init_fast(struct rxe_pd *pd, } static void lookup_iova( - struct rxe_mem *mem, + struct rxe_mr *mr, u64 iova, int *m_out, int *n_out, size_t *offset_out) { - size_t offset = iova - mem->iova + mem->offset; + size_t offset = iova - mr->iova + mr->offset; int map_index; int buf_index; u64 length; - if (likely(mem->page_shift)) { - *offset_out = offset & mem->page_mask; - offset >>= mem->page_shift; - *n_out = offset & mem->map_mask; - *m_out = offset >> mem->map_shift; + if (likely(mr->page_shift)) { + *offset_out = offset & mr->page_mask; + offset >>= mr->page_shift; + *n_out = offset & mr->map_mask; + *m_out = offset >> mr->map_shift; } else { map_index = 0; buf_index = 0; - length = mem->map[map_index]->buf[buf_index].size; + length = mr->map[map_index]->buf[buf_index].size; while (offset >= 
length) { offset -= length; @@ -262,7 +262,7 @@ static void lookup_iova( map_index++; buf_index = 0; } - length = mem->map[map_index]->buf[buf_index].size; + length = mr->map[map_index]->buf[buf_index].size; } *m_out = map_index; @@ -271,48 +271,48 @@ static void lookup_iova( } } -void *iova_to_vaddr(struct rxe_mem *mem, u64 iova, int length) +void *iova_to_vaddr(struct rxe_mr *mr, u64 iova, int length) { size_t offset; int m, n; void *addr; - if (mem->state != RXE_MEM_STATE_VALID) { - pr_warn("mem not in valid state\n"); + if (mr->state != RXE_MEM_STATE_VALID) { + pr_warn("mr not in valid state\n"); addr = NULL; goto out; } - if (!mem->map) { + if (!mr->map) { addr = (void *)(uintptr_t)iova; goto out; } - if (mem_check_range(mem, iova, length)) { + if (mr_check_range(mr, iova, length)) { pr_warn("range violation\n"); addr = NULL; goto out; } - lookup_iova(mem, iova, &m, &n, &offset); + lookup_iova(mr, iova, &m, &n, &offset); - if (offset + length > mem->map[m]->buf[n].size) { + if (offset + length > mr->map[m]->buf[n].size) { pr_warn("crosses page boundary\n"); addr = NULL; goto out; } - addr = (void *)(uintptr_t)mem->map[m]->buf[n].addr + offset; + addr = (void *)(uintptr_t)mr->map[m]->buf[n].addr + offset; out: return addr; } /* copy data from a range (vaddr, vaddr+length-1) to or from - * a mem object starting at iova. Compute incremental value of - * crc32 if crcp is not zero. caller must hold a reference to mem + * a mr object starting at iova. Compute incremental value of + * crc32 if crcp is not zero. caller must hold a reference to mr */ -int rxe_mem_copy(struct rxe_mem *mem, u64 iova, void *addr, int length, +int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length, enum copy_direction dir, u32 *crcp) { int err; @@ -328,43 +328,43 @@ int rxe_mem_copy(struct rxe_mem *mem, u64 iova, void *addr, int length, if (length == 0) return 0; - if (mem->type == RXE_MEM_TYPE_DMA) { + if (mr->type == RXE_MR_TYPE_DMA) { u8 *src, *dest; - src = (dir == to_mem_obj) ? + src = (dir == to_mr_obj) ? addr : ((void *)(uintptr_t)iova); - dest = (dir == to_mem_obj) ? + dest = (dir == to_mr_obj) ? ((void *)(uintptr_t)iova) : addr; memcpy(dest, src, length); if (crcp) - *crcp = rxe_crc32(to_rdev(mem->pd->ibpd.device), + *crcp = rxe_crc32(to_rdev(mr->pd->ibpd.device), *crcp, dest, length); return 0; } - WARN_ON_ONCE(!mem->map); + WARN_ON_ONCE(!mr->map); - err = mem_check_range(mem, iova, length); + err = mr_check_range(mr, iova, length); if (err) { err = -EFAULT; goto err1; } - lookup_iova(mem, iova, &m, &i, &offset); + lookup_iova(mr, iova, &m, &i, &offset); - map = mem->map + m; + map = mr->map + m; buf = map[0]->buf + i; while (length > 0) { u8 *src, *dest; va = (u8 *)(uintptr_t)buf->addr + offset; - src = (dir == to_mem_obj) ? addr : va; - dest = (dir == to_mem_obj) ? va : addr; + src = (dir == to_mr_obj) ? addr : va; + dest = (dir == to_mr_obj) ? 
va : addr; bytes = buf->size - offset; @@ -374,7 +374,7 @@ int rxe_mem_copy(struct rxe_mem *mem, u64 iova, void *addr, int length, memcpy(dest, src, bytes); if (crcp) - crc = rxe_crc32(to_rdev(mem->pd->ibpd.device), + crc = rxe_crc32(to_rdev(mr->pd->ibpd.device), crc, dest, bytes); length -= bytes; @@ -416,7 +416,7 @@ int copy_data( struct rxe_sge *sge = &dma->sge[dma->cur_sge]; int offset = dma->sge_offset; int resid = dma->resid; - struct rxe_mem *mem = NULL; + struct rxe_mr *mr = NULL; u64 iova; int err; @@ -429,8 +429,8 @@ int copy_data( } if (sge->length && (offset < sge->length)) { - mem = lookup_mem(pd, access, sge->lkey, lookup_local); - if (!mem) { + mr = lookup_mr(pd, access, sge->lkey, lookup_local); + if (!mr) { err = -EINVAL; goto err1; } @@ -440,9 +440,9 @@ int copy_data( bytes = length; if (offset >= sge->length) { - if (mem) { - rxe_drop_ref(mem); - mem = NULL; + if (mr) { + rxe_drop_ref(mr); + mr = NULL; } sge++; dma->cur_sge++; @@ -454,9 +454,9 @@ int copy_data( } if (sge->length) { - mem = lookup_mem(pd, access, sge->lkey, + mr = lookup_mr(pd, access, sge->lkey, lookup_local); - if (!mem) { + if (!mr) { err = -EINVAL; goto err1; } @@ -471,7 +471,7 @@ int copy_data( if (bytes > 0) { iova = sge->addr + offset; - err = rxe_mem_copy(mem, iova, addr, bytes, dir, crcp); + err = rxe_mr_copy(mr, iova, addr, bytes, dir, crcp); if (err) goto err2; @@ -485,14 +485,14 @@ int copy_data( dma->sge_offset = offset; dma->resid = resid; - if (mem) - rxe_drop_ref(mem); + if (mr) + rxe_drop_ref(mr); return 0; err2: - if (mem) - rxe_drop_ref(mem); + if (mr) + rxe_drop_ref(mr); err1: return err; } @@ -530,31 +530,31 @@ int advance_dma_data(struct rxe_dma_info *dma, unsigned int length) return 0; } -/* (1) find the mem (mr or mw) corresponding to lkey/rkey +/* (1) find the mr corresponding to lkey/rkey * depending on lookup_type - * (2) verify that the (qp) pd matches the mem pd - * (3) verify that the mem can support the requested access - * (4) verify that mem state is valid + * (2) verify that the (qp) pd matches the mr pd + * (3) verify that the mr can support the requested access + * (4) verify that mr state is valid */ -struct rxe_mem *lookup_mem(struct rxe_pd *pd, int access, u32 key, +struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key, enum lookup_type type) { - struct rxe_mem *mem; + struct rxe_mr *mr; struct rxe_dev *rxe = to_rdev(pd->ibpd.device); int index = key >> 8; - mem = rxe_pool_get_index(&rxe->mr_pool, index); - if (!mem) + mr = rxe_pool_get_index(&rxe->mr_pool, index); + if (!mr) return NULL; - if (unlikely((type == lookup_local && mem->lkey != key) || - (type == lookup_remote && mem->rkey != key) || - mem->pd != pd || - (access && !(access & mem->access)) || - mem->state != RXE_MEM_STATE_VALID)) { - rxe_drop_ref(mem); - mem = NULL; + if (unlikely((type == lookup_local && mr->lkey != key) || + (type == lookup_remote && mr->rkey != key) || + mr->pd != pd || + (access && !(access & mr->access)) || + mr->state != RXE_MEM_STATE_VALID)) { + rxe_drop_ref(mr); + mr = NULL; } - return mem; + return mr; } diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c index b374eb53e2fe..32ba47d143f3 100644 --- a/drivers/infiniband/sw/rxe/rxe_pool.c +++ b/drivers/infiniband/sw/rxe/rxe_pool.c @@ -8,8 +8,6 @@ #include "rxe_loc.h" /* info about object pools - * note that mr and mw share a single index space - * so that one can map an lkey to the correct type of object */ struct rxe_type_info rxe_type_info[RXE_NUM_TYPES] = { [RXE_TYPE_UC] = { @@ 
-50,15 +48,15 @@ struct rxe_type_info rxe_type_info[RXE_NUM_TYPES] = { }, [RXE_TYPE_MR] = { .name = "rxe-mr", - .size = sizeof(struct rxe_mem), - .cleanup = rxe_mem_cleanup, + .size = sizeof(struct rxe_mr), + .cleanup = rxe_mr_cleanup, .flags = RXE_POOL_INDEX, .max_index = RXE_MAX_MR_INDEX, .min_index = RXE_MIN_MR_INDEX, }, [RXE_TYPE_MW] = { .name = "rxe-mw", - .size = sizeof(struct rxe_mem), + .size = sizeof(struct rxe_mw), .flags = RXE_POOL_INDEX, .max_index = RXE_MAX_MW_INDEX, .min_index = RXE_MIN_MW_INDEX, diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c index e27585ce9eb7..57236d8c2146 100644 --- a/drivers/infiniband/sw/rxe/rxe_req.c +++ b/drivers/infiniband/sw/rxe/rxe_req.c @@ -465,7 +465,7 @@ static int fill_packet(struct rxe_qp *qp, struct rxe_send_wqe *wqe, } else { err = copy_data(qp->pd, 0, &wqe->dma, payload_addr(pkt), paylen, - from_mem_obj, + from_mr_obj, &crc); if (err) return err; @@ -597,7 +597,7 @@ int rxe_requester(void *arg) if (wqe->mask & WR_REG_MASK) { if (wqe->wr.opcode == IB_WR_LOCAL_INV) { struct rxe_dev *rxe = to_rdev(qp->ibqp.device); - struct rxe_mem *rmr; + struct rxe_mr *rmr; rmr = rxe_pool_get_index(&rxe->mr_pool, wqe->wr.ex.invalidate_rkey >> 8); @@ -613,7 +613,7 @@ int rxe_requester(void *arg) wqe->state = wqe_state_done; wqe->status = IB_WC_SUCCESS; } else if (wqe->wr.opcode == IB_WR_REG_MR) { - struct rxe_mem *rmr = to_rmr(wqe->wr.wr.reg.mr); + struct rxe_mr *rmr = to_rmr(wqe->wr.wr.reg.mr); rmr->state = RXE_MEM_STATE_VALID; rmr->access = wqe->wr.wr.reg.access; diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c index c7e3b6a4af38..69867bf39cfb 100644 --- a/drivers/infiniband/sw/rxe/rxe_resp.c +++ b/drivers/infiniband/sw/rxe/rxe_resp.c @@ -390,7 +390,7 @@ static enum resp_states check_length(struct rxe_qp *qp, static enum resp_states check_rkey(struct rxe_qp *qp, struct rxe_pkt_info *pkt) { - struct rxe_mem *mem = NULL; + struct rxe_mr *mr = NULL; u64 va; u32 rkey; u32 resid; @@ -429,18 +429,18 @@ static enum resp_states check_rkey(struct rxe_qp *qp, resid = qp->resp.resid; pktlen = payload_size(pkt); - mem = lookup_mem(qp->pd, access, rkey, lookup_remote); - if (!mem) { + mr = lookup_mr(qp->pd, access, rkey, lookup_remote); + if (!mr) { state = RESPST_ERR_RKEY_VIOLATION; goto err; } - if (unlikely(mem->state == RXE_MEM_STATE_FREE)) { + if (unlikely(mr->state == RXE_MEM_STATE_FREE)) { state = RESPST_ERR_RKEY_VIOLATION; goto err; } - if (mem_check_range(mem, va, resid)) { + if (mr_check_range(mr, va, resid)) { state = RESPST_ERR_RKEY_VIOLATION; goto err; } @@ -468,12 +468,12 @@ static enum resp_states check_rkey(struct rxe_qp *qp, WARN_ON_ONCE(qp->resp.mr); - qp->resp.mr = mem; + qp->resp.mr = mr; return RESPST_EXECUTE; err: - if (mem) - rxe_drop_ref(mem); + if (mr) + rxe_drop_ref(mr); return state; } @@ -483,7 +483,7 @@ static enum resp_states send_data_in(struct rxe_qp *qp, void *data_addr, int err; err = copy_data(qp->pd, IB_ACCESS_LOCAL_WRITE, &qp->resp.wqe->dma, - data_addr, data_len, to_mem_obj, NULL); + data_addr, data_len, to_mr_obj, NULL); if (unlikely(err)) return (err == -ENOSPC) ? 
RESPST_ERR_LENGTH : RESPST_ERR_MALFORMED_WQE; @@ -498,8 +498,8 @@ static enum resp_states write_data_in(struct rxe_qp *qp, int err; int data_len = payload_size(pkt); - err = rxe_mem_copy(qp->resp.mr, qp->resp.va, payload_addr(pkt), - data_len, to_mem_obj, NULL); + err = rxe_mr_copy(qp->resp.mr, qp->resp.va, payload_addr(pkt), + data_len, to_mr_obj, NULL); if (err) { rc = RESPST_ERR_RKEY_VIOLATION; goto out; @@ -521,7 +521,7 @@ static enum resp_states process_atomic(struct rxe_qp *qp, u64 iova = atmeth_va(pkt); u64 *vaddr; enum resp_states ret; - struct rxe_mem *mr = qp->resp.mr; + struct rxe_mr *mr = qp->resp.mr; if (mr->state != RXE_MEM_STATE_VALID) { ret = RESPST_ERR_RKEY_VIOLATION; @@ -700,8 +700,8 @@ static enum resp_states read_reply(struct rxe_qp *qp, if (!skb) return RESPST_ERR_RNR; - err = rxe_mem_copy(res->read.mr, res->read.va, payload_addr(&ack_pkt), - payload, from_mem_obj, &icrc); + err = rxe_mr_copy(res->read.mr, res->read.va, payload_addr(&ack_pkt), + payload, from_mr_obj, &icrc); if (err) pr_err("Failed copying memory\n"); @@ -883,7 +883,7 @@ static enum resp_states do_complete(struct rxe_qp *qp, } if (pkt->mask & RXE_IETH_MASK) { - struct rxe_mem *rmr; + struct rxe_mr *rmr; wc->wc_flags |= IB_WC_WITH_INVALIDATE; wc->ex.invalidate_rkey = ieth_rkey(pkt); diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 36edc294e105..0f4c7d2f743a 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -863,7 +863,7 @@ static struct ib_mr *rxe_get_dma_mr(struct ib_pd *ibpd, int access) { struct rxe_dev *rxe = to_rdev(ibpd->device); struct rxe_pd *pd = to_rpd(ibpd); - struct rxe_mem *mr; + struct rxe_mr *mr; mr = rxe_alloc(&rxe->mr_pool); if (!mr) @@ -871,7 +871,7 @@ static struct ib_mr *rxe_get_dma_mr(struct ib_pd *ibpd, int access) rxe_add_index(mr); rxe_add_ref(pd); - rxe_mem_init_dma(pd, access, mr); + rxe_mr_init_dma(pd, access, mr); return &mr->ibmr; } @@ -885,7 +885,7 @@ static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd, int err; struct rxe_dev *rxe = to_rdev(ibpd->device); struct rxe_pd *pd = to_rpd(ibpd); - struct rxe_mem *mr; + struct rxe_mr *mr; mr = rxe_alloc(&rxe->mr_pool); if (!mr) { @@ -897,7 +897,7 @@ static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd, rxe_add_ref(pd); - err = rxe_mem_init_user(pd, start, length, iova, + err = rxe_mr_init_user(pd, start, length, iova, access, udata, mr); if (err) goto err3; @@ -914,7 +914,7 @@ static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd, static int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata) { - struct rxe_mem *mr = to_rmr(ibmr); + struct rxe_mr *mr = to_rmr(ibmr); mr->state = RXE_MEM_STATE_ZOMBIE; rxe_drop_ref(mr->pd); @@ -928,7 +928,7 @@ static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type, { struct rxe_dev *rxe = to_rdev(ibpd->device); struct rxe_pd *pd = to_rpd(ibpd); - struct rxe_mem *mr; + struct rxe_mr *mr; int err; if (mr_type != IB_MR_TYPE_MEM_REG) @@ -944,7 +944,7 @@ static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type, rxe_add_ref(pd); - err = rxe_mem_init_fast(pd, max_num_sg, mr); + err = rxe_mr_init_fast(pd, max_num_sg, mr); if (err) goto err2; @@ -960,7 +960,7 @@ static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type, static int rxe_set_page(struct ib_mr *ibmr, u64 addr) { - struct rxe_mem *mr = to_rmr(ibmr); + struct rxe_mr *mr = to_rmr(ibmr); struct rxe_map *map; struct rxe_phys_buf *buf; @@ -980,7 +980,7 @@ static int 
rxe_set_page(struct ib_mr *ibmr, u64 addr) static int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, int sg_nents, unsigned int *sg_offset) { - struct rxe_mem *mr = to_rmr(ibmr); + struct rxe_mr *mr = to_rmr(ibmr); int n; mr->nbuf = 0; diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h index 560a610bb0aa..dbc649c9c43f 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.h +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h @@ -39,7 +39,7 @@ struct rxe_ucontext { }; struct rxe_pd { - struct ib_pd ibpd; + struct ib_pd ibpd; struct rxe_pool_entry pelem; }; @@ -156,7 +156,7 @@ struct resp_res { struct sk_buff *skb; } atomic; struct { - struct rxe_mem *mr; + struct rxe_mr *mr; u64 va_org; u32 rkey; u32 length; @@ -183,7 +183,7 @@ struct rxe_resp_info { /* RDMA read / atomic only */ u64 va; - struct rxe_mem *mr; + struct rxe_mr *mr; u32 resid; u32 rkey; u32 length; @@ -269,31 +269,27 @@ enum rxe_mem_state { RXE_MEM_STATE_VALID, }; -enum rxe_mem_type { - RXE_MEM_TYPE_NONE, - RXE_MEM_TYPE_DMA, - RXE_MEM_TYPE_MR, - RXE_MEM_TYPE_FMR, - RXE_MEM_TYPE_MW, +enum rxe_mr_type { + RXE_MR_TYPE_NONE, + RXE_MR_TYPE_DMA, + RXE_MR_TYPE_MR, + RXE_MR_TYPE_FMR, }; #define RXE_BUF_PER_MAP (PAGE_SIZE / sizeof(struct rxe_phys_buf)) struct rxe_phys_buf { - u64 addr; - u64 size; + u64 addr; + u64 size; }; struct rxe_map { struct rxe_phys_buf buf[RXE_BUF_PER_MAP]; }; -struct rxe_mem { +struct rxe_mr { struct rxe_pool_entry pelem; - union { - struct ib_mr ibmr; - struct ib_mw ibmw; - }; + struct ib_mr ibmr; struct rxe_pd *pd; struct ib_umem *umem; @@ -302,7 +298,7 @@ struct rxe_mem { u32 rkey; enum rxe_mem_state state; - enum rxe_mem_type type; + enum rxe_mr_type type; u64 va; u64 iova; size_t length; @@ -323,6 +319,18 @@ struct rxe_mem { struct rxe_map **map; }; +struct rxe_mw { + struct rxe_pool_entry pelem; + struct ib_mw ibmw; + struct rxe_qp *qp; /* type 2B only */ + struct rxe_mr *mr; + spinlock_t lock; + enum rxe_mem_state state; + u32 access; + u64 addr; + u64 length; +}; + struct rxe_mc_grp { struct rxe_pool_entry pelem; spinlock_t mcg_lock; /* guard group */ @@ -428,14 +436,9 @@ static inline struct rxe_cq *to_rcq(struct ib_cq *cq) return cq ? container_of(cq, struct rxe_cq, ibcq) : NULL; } -static inline struct rxe_mem *to_rmr(struct ib_mr *mr) -{ - return mr ? container_of(mr, struct rxe_mem, ibmr) : NULL; -} - -static inline struct rxe_mem *to_rmw(struct ib_mw *mw) +static inline struct rxe_mr *to_rmr(struct ib_mr *mr) { - return mw ? container_of(mw, struct rxe_mem, ibmw) : NULL; + return mr ? 
container_of(mr, struct rxe_mr, ibmr) : NULL; } int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name);
From patchwork Thu Sep 3 22:40:36 2020
X-Patchwork-Submitter: Bob Pearson
X-Patchwork-Id: 11755457
From: Bob Pearson
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson
Subject: [PATCH v4 for-next 3/7] rdma_rxe: enabled MW objects
Date: Thu, 3 Sep 2020 17:40:36 -0500
Message-Id: <20200903224039.437391-4-rpearson@hpe.com>
In-Reply-To: <20200903224039.437391-1-rpearson@hpe.com>
References: <20200903224039.437391-1-rpearson@hpe.com>
Changed parameters in rxe_param.h so that MAX_MW is the same as MAX_MR. Set device attribute in rxe.c so max_mw = MAX_MW.
Signed-off-by: Bob Pearson --- drivers/infiniband/sw/rxe/rxe.c | 1 + drivers/infiniband/sw/rxe/rxe_param.h | 10 ++++++---- 2 files changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c index 43b327b53e26..fab291245366 100644 --- a/drivers/infiniband/sw/rxe/rxe.c +++ b/drivers/infiniband/sw/rxe/rxe.c @@ -52,6 +52,7 @@ static void rxe_init_device_param(struct rxe_dev *rxe) rxe->attr.max_cq = RXE_MAX_CQ; rxe->attr.max_cqe = (1 << RXE_MAX_LOG_CQE) - 1; rxe->attr.max_mr = RXE_MAX_MR; + rxe->attr.max_mw = RXE_MAX_MW; rxe->attr.max_pd = RXE_MAX_PD; rxe->attr.max_qp_rd_atom = RXE_MAX_QP_RD_ATOM; rxe->attr.max_res_rd_atom = RXE_MAX_RES_RD_ATOM; diff --git a/drivers/infiniband/sw/rxe/rxe_param.h b/drivers/infiniband/sw/rxe/rxe_param.h index 25ab50d9b7c2..4ebb3da8c07d 100644 --- a/drivers/infiniband/sw/rxe/rxe_param.h +++ b/drivers/infiniband/sw/rxe/rxe_param.h @@ -58,7 +58,8 @@ enum rxe_device_param { RXE_MAX_SGE_RD = 32, RXE_MAX_CQ = 16384, RXE_MAX_LOG_CQE = 15, - RXE_MAX_MR = 256 * 1024, + RXE_MAX_MR = 0x40000, + RXE_MAX_MW = 0x40000, RXE_MAX_PD = 0x7ffc, RXE_MAX_QP_RD_ATOM = 128, RXE_MAX_RES_RD_ATOM = 0x3f000, @@ -87,9 +88,10 @@ enum rxe_device_param { RXE_MAX_SRQ_INDEX = 0x00040000, RXE_MIN_MR_INDEX = 0x00000001, - RXE_MAX_MR_INDEX = 0x00040000, - RXE_MIN_MW_INDEX = 0x00040001, - RXE_MAX_MW_INDEX = 0x00060000, + RXE_MAX_MR_INDEX = RXE_MIN_MR_INDEX + RXE_MAX_MR - 1, + RXE_MIN_MW_INDEX = RXE_MIN_MR_INDEX + RXE_MAX_MR, + RXE_MAX_MW_INDEX = RXE_MIN_MW_INDEX + RXE_MAX_MW - 1, + RXE_MAX_PKT_PER_ACK = 64, RXE_MAX_UNACKED_PSNS = 128,
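For reference, the new arithmetic above gives MRs and MWs equal, adjacent index ranges. The following compile-time sketch works out the resulting layout; it is illustrative only (values copied from the rxe_param.h hunk above, not part of the patch):

/* illustrative only: check the index layout implied by the new constants */
#include <assert.h>

enum {
	RXE_MAX_MR       = 0x40000,
	RXE_MAX_MW       = 0x40000,
	RXE_MIN_MR_INDEX = 0x00000001,
	RXE_MAX_MR_INDEX = RXE_MIN_MR_INDEX + RXE_MAX_MR - 1,
	RXE_MIN_MW_INDEX = RXE_MIN_MR_INDEX + RXE_MAX_MR,
	RXE_MAX_MW_INDEX = RXE_MIN_MW_INDEX + RXE_MAX_MW - 1,
};

static_assert(RXE_MAX_MR_INDEX == 0x40000, "MR indices run 0x00001..0x40000");
static_assert(RXE_MIN_MW_INDEX == 0x40001, "MW indices start right after the MR range");
static_assert(RXE_MAX_MW_INDEX == 0x80000, "MW range is now as large as the MR range");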
From patchwork Thu Sep 3 22:40:37 2020
X-Patchwork-Submitter: Bob Pearson
X-Patchwork-Id: 11755453
From: Bob Pearson
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson
Subject: [PATCH v4 for-next 4/7] rdma_rxe: Let pools support both keys and indices
Date: Thu, 3 Sep 2020 17:40:37 -0500
Message-Id: <20200903224039.437391-5-rpearson@hpe.com>
In-Reply-To: <20200903224039.437391-1-rpearson@hpe.com>
References: <20200903224039.437391-1-rpearson@hpe.com>
Allowed both indices and keys to exist for objects in pools. Previously you were limited to one or the other. This will support allowing the keys on MWs to change.
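As a condensed sketch of what the diff below does to the pool data structures, each pool entry now carries two rb-tree linkages, one for the index tree and one for the key tree. Field names follow the patch; the rb_node/rb_root stubs only stand in for the kernel's <linux/rbtree.h> types:

/* sketch only: stand-ins for the kernel rbtree types */
struct rb_node { struct rb_node *rb_left, *rb_right, *rb_parent; };
struct rb_root { struct rb_node *rb_node; };

struct pool_entry_sketch {
	struct rb_node key_node;	/* links the entry into pool->key.tree */
	struct rb_node index_node;	/* links the entry into pool->index.tree */
	unsigned int index;
};

struct pool_sketch {
	struct {			/* only used if indexed */
		struct rb_root tree;
		unsigned int last, max_index, min_index;
	} index;
	struct {			/* only used if keyed */
		struct rb_root tree;
		unsigned long key_offset, key_size;
	} key;
};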
Signed-off-by: Bob Pearson --- drivers/infiniband/sw/rxe/rxe_pool.c | 73 ++++++++++++++-------------- drivers/infiniband/sw/rxe/rxe_pool.h | 32 +++++++----- 2 files changed, 58 insertions(+), 47 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c index 32ba47d143f3..30b8f037ee20 100644 --- a/drivers/infiniband/sw/rxe/rxe_pool.c +++ b/drivers/infiniband/sw/rxe/rxe_pool.c @@ -92,18 +92,18 @@ static int rxe_pool_init_index(struct rxe_pool *pool, u32 max, u32 min) goto out; } - pool->max_index = max; - pool->min_index = min; + pool->index.max_index = max; + pool->index.min_index = min; size = BITS_TO_LONGS(max - min + 1) * sizeof(long); - pool->table = kmalloc(size, GFP_KERNEL); - if (!pool->table) { + pool->index.table = kmalloc(size, GFP_KERNEL); + if (!pool->index.table) { err = -ENOMEM; goto out; } - pool->table_size = size; - bitmap_zero(pool->table, max - min + 1); + pool->index.table_size = size; + bitmap_zero(pool->index.table, max - min + 1); out: return err; @@ -125,7 +125,8 @@ int rxe_pool_init( pool->max_elem = max_elem; pool->elem_size = ALIGN(size, RXE_POOL_ALIGN); pool->flags = rxe_type_info[type].flags; - pool->tree = RB_ROOT; + pool->index.tree = RB_ROOT; + pool->key.tree = RB_ROOT; pool->cleanup = rxe_type_info[type].cleanup; atomic_set(&pool->num_elem, 0); @@ -143,8 +144,8 @@ int rxe_pool_init( } if (rxe_type_info[type].flags & RXE_POOL_KEY) { - pool->key_offset = rxe_type_info[type].key_offset; - pool->key_size = rxe_type_info[type].key_size; + pool->key.key_offset = rxe_type_info[type].key_offset; + pool->key.key_size = rxe_type_info[type].key_size; } pool->state = RXE_POOL_STATE_VALID; @@ -158,7 +159,7 @@ static void rxe_pool_release(struct kref *kref) struct rxe_pool *pool = container_of(kref, struct rxe_pool, ref_cnt); pool->state = RXE_POOL_STATE_INVALID; - kfree(pool->table); + kfree(pool->index.table); } static void rxe_pool_put(struct rxe_pool *pool) @@ -183,27 +184,27 @@ void rxe_pool_cleanup(struct rxe_pool *pool) static u32 alloc_index(struct rxe_pool *pool) { u32 index; - u32 range = pool->max_index - pool->min_index + 1; + u32 range = pool->index.max_index - pool->index.min_index + 1; - index = find_next_zero_bit(pool->table, range, pool->last); + index = find_next_zero_bit(pool->index.table, range, pool->index.last); if (index >= range) - index = find_first_zero_bit(pool->table, range); + index = find_first_zero_bit(pool->index.table, range); WARN_ON_ONCE(index >= range); - set_bit(index, pool->table); - pool->last = index; - return index + pool->min_index; + set_bit(index, pool->index.table); + pool->index.last = index; + return index + pool->index.min_index; } static void insert_index(struct rxe_pool *pool, struct rxe_pool_entry *new) { - struct rb_node **link = &pool->tree.rb_node; + struct rb_node **link = &pool->index.tree.rb_node; struct rb_node *parent = NULL; struct rxe_pool_entry *elem; while (*link) { parent = *link; - elem = rb_entry(parent, struct rxe_pool_entry, node); + elem = rb_entry(parent, struct rxe_pool_entry, index_node); if (elem->index == new->index) { pr_warn("element already exists!\n"); @@ -216,25 +217,25 @@ static void insert_index(struct rxe_pool *pool, struct rxe_pool_entry *new) link = &(*link)->rb_right; } - rb_link_node(&new->node, parent, link); - rb_insert_color(&new->node, &pool->tree); + rb_link_node(&new->index_node, parent, link); + rb_insert_color(&new->index_node, &pool->index.tree); out: return; } static void insert_key(struct rxe_pool *pool, struct rxe_pool_entry *new) 
{ - struct rb_node **link = &pool->tree.rb_node; + struct rb_node **link = &pool->key.tree.rb_node; struct rb_node *parent = NULL; struct rxe_pool_entry *elem; int cmp; while (*link) { parent = *link; - elem = rb_entry(parent, struct rxe_pool_entry, node); + elem = rb_entry(parent, struct rxe_pool_entry, key_node); - cmp = memcmp((u8 *)elem + pool->key_offset, - (u8 *)new + pool->key_offset, pool->key_size); + cmp = memcmp((u8 *)elem + pool->key.key_offset, + (u8 *)new + pool->key.key_offset, pool->key.key_size); if (cmp == 0) { pr_warn("key already exists!\n"); @@ -247,8 +248,8 @@ static void insert_key(struct rxe_pool *pool, struct rxe_pool_entry *new) link = &(*link)->rb_right; } - rb_link_node(&new->node, parent, link); - rb_insert_color(&new->node, &pool->tree); + rb_link_node(&new->key_node, parent, link); + rb_insert_color(&new->key_node, &pool->key.tree); out: return; } @@ -260,7 +261,7 @@ void rxe_add_key(void *arg, void *key) unsigned long flags; write_lock_irqsave(&pool->pool_lock, flags); - memcpy((u8 *)elem + pool->key_offset, key, pool->key_size); + memcpy((u8 *)elem + pool->key.key_offset, key, pool->key.key_size); insert_key(pool, elem); write_unlock_irqrestore(&pool->pool_lock, flags); } @@ -272,7 +273,7 @@ void rxe_drop_key(void *arg) unsigned long flags; write_lock_irqsave(&pool->pool_lock, flags); - rb_erase(&elem->node, &pool->tree); + rb_erase(&elem->key_node, &pool->key.tree); write_unlock_irqrestore(&pool->pool_lock, flags); } @@ -295,8 +296,8 @@ void rxe_drop_index(void *arg) unsigned long flags; write_lock_irqsave(&pool->pool_lock, flags); - clear_bit(elem->index - pool->min_index, pool->table); - rb_erase(&elem->node, &pool->tree); + clear_bit(elem->index - pool->index.min_index, pool->index.table); + rb_erase(&elem->index_node, &pool->index.tree); write_unlock_irqrestore(&pool->pool_lock, flags); } @@ -400,10 +401,10 @@ void *rxe_pool_get_index(struct rxe_pool *pool, u32 index) if (pool->state != RXE_POOL_STATE_VALID) goto out; - node = pool->tree.rb_node; + node = pool->index.tree.rb_node; while (node) { - elem = rb_entry(node, struct rxe_pool_entry, node); + elem = rb_entry(node, struct rxe_pool_entry, index_node); if (elem->index > index) node = node->rb_left; @@ -432,13 +433,13 @@ void *rxe_pool_get_key(struct rxe_pool *pool, void *key) if (pool->state != RXE_POOL_STATE_VALID) goto out; - node = pool->tree.rb_node; + node = pool->key.tree.rb_node; while (node) { - elem = rb_entry(node, struct rxe_pool_entry, node); + elem = rb_entry(node, struct rxe_pool_entry, key_node); - cmp = memcmp((u8 *)elem + pool->key_offset, - key, pool->key_size); + cmp = memcmp((u8 *)elem + pool->key.key_offset, + key, pool->key.key_size); if (cmp > 0) node = node->rb_left; diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h index 432745ffc8d4..3d722aae5f15 100644 --- a/drivers/infiniband/sw/rxe/rxe_pool.h +++ b/drivers/infiniband/sw/rxe/rxe_pool.h @@ -56,8 +56,11 @@ struct rxe_pool_entry { struct kref ref_cnt; struct list_head list; - /* only used if indexed or keyed */ - struct rb_node node; + /* only used if keyed */ + struct rb_node key_node; + + /* only used if indexed */ + struct rb_node index_node; u32 index; }; @@ -74,15 +77,22 @@ struct rxe_pool { unsigned int max_elem; atomic_t num_elem; - /* only used if indexed or keyed */ - struct rb_root tree; - unsigned long *table; - size_t table_size; - u32 max_index; - u32 min_index; - u32 last; - size_t key_offset; - size_t key_size; + /* only used if indexed */ + struct { + struct rb_root 
tree; + unsigned long *table; + size_t table_size; + u32 last; + u32 max_index; + u32 min_index; + } index; + + /* only used if keyed */ + struct { + struct rb_root tree; + size_t key_offset; + size_t key_size; + } key; }; /* initialize a pool of objects with given limit on
From patchwork Thu Sep 3 22:40:38 2020
X-Patchwork-Submitter: Bob Pearson
X-Patchwork-Id: 11755459
From: Bob Pearson
To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Cc: Bob Pearson
Subject: [PATCH v4 for-next
5/7] rdma_rxe: Added alloc_mw and dealloc_mw verbs Date: Thu, 3 Sep 2020 17:40:38 -0500 Message-Id: <20200903224039.437391-6-rpearson@hpe.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200903224039.437391-1-rpearson@hpe.com> References: <20200903224039.437391-1-rpearson@hpe.com> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org - Added a new file focused on memory windows, rxe_mw.c. - Added alloc_mw and dealloc_mw verbs and added them to the list of supported user space verbs. Signed-off-by: Bob Pearson --- drivers/infiniband/sw/rxe/Makefile | 1 + drivers/infiniband/sw/rxe/rxe_loc.h | 8 +++ drivers/infiniband/sw/rxe/rxe_mr.c | 77 ++++++++++----------- drivers/infiniband/sw/rxe/rxe_mw.c | 98 +++++++++++++++++++++++++++ drivers/infiniband/sw/rxe/rxe_pool.c | 33 +++++---- drivers/infiniband/sw/rxe/rxe_pool.h | 2 +- drivers/infiniband/sw/rxe/rxe_req.c | 24 +++---- drivers/infiniband/sw/rxe/rxe_resp.c | 4 +- drivers/infiniband/sw/rxe/rxe_verbs.c | 52 +++++++++----- drivers/infiniband/sw/rxe/rxe_verbs.h | 8 +++ include/uapi/rdma/rdma_user_rxe.h | 10 +++ 11 files changed, 232 insertions(+), 85 deletions(-) create mode 100644 drivers/infiniband/sw/rxe/rxe_mw.c diff --git a/drivers/infiniband/sw/rxe/Makefile b/drivers/infiniband/sw/rxe/Makefile index 66af72dca759..1e24673e9318 100644 --- a/drivers/infiniband/sw/rxe/Makefile +++ b/drivers/infiniband/sw/rxe/Makefile @@ -15,6 +15,7 @@ rdma_rxe-y := \ rxe_qp.o \ rxe_cq.o \ rxe_mr.o \ + rxe_mw.o \ rxe_opcode.o \ rxe_mmap.o \ rxe_icrc.o \ diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h index 9ec6bff6863f..65f2e4a94956 100644 --- a/drivers/infiniband/sw/rxe/rxe_loc.h +++ b/drivers/infiniband/sw/rxe/rxe_loc.h @@ -109,6 +109,14 @@ void rxe_mr_cleanup(struct rxe_pool_entry *arg); int advance_dma_data(struct rxe_dma_info *dma, unsigned int length); +/* rxe_mw.c */ +struct ib_mw *rxe_alloc_mw(struct ib_pd *ibpd, enum ib_mw_type type, + struct ib_udata *udata); + +int rxe_dealloc_mw(struct ib_mw *ibmw); + +void rxe_mw_cleanup(struct rxe_pool_entry *arg); + /* rxe_net.c */ void rxe_loopback(struct sk_buff *skb); int rxe_send(struct rxe_pkt_info *pkt, struct sk_buff *skb); diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c index 368012904879..4c53badfa4e9 100644 --- a/drivers/infiniband/sw/rxe/rxe_mr.c +++ b/drivers/infiniband/sw/rxe/rxe_mr.c @@ -7,21 +7,18 @@ #include "rxe.h" #include "rxe_loc.h" -/* - * lfsr (linear feedback shift register) with period 255 +/* choose a unique non zero random number for lkey + * use high order bit to indicate MR vs MW */ -static u8 rxe_get_key(void) +static void rxe_set_mr_lkey(struct rxe_mr *mr) { - static u32 key = 1; - - key = key << 1; - - key |= (0 != (key & 0x100)) ^ (0 != (key & 0x10)) - ^ (0 != (key & 0x80)) ^ (0 != (key & 0x40)); - - key &= 0xff; - - return key; + u32 lkey; +again: + get_random_bytes(&lkey, sizeof(lkey)); + lkey &= ~IS_MW; + if (likely(lkey && (rxe_add_key(mr, &lkey) == 0))) + return; + goto again; } int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length) @@ -49,36 +46,19 @@ int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length) static void rxe_mr_init(int access, struct rxe_mr *mr) { - u32 lkey = mr->pelem.index << 8 | rxe_get_key(); - u32 rkey = (access & IB_ACCESS_REMOTE) ? 
lkey : 0; - - if (mr->pelem.pool->type == RXE_TYPE_MR) { - mr->ibmr.lkey = lkey; - mr->ibmr.rkey = rkey; - } - - mr->lkey = lkey; - mr->rkey = rkey; + rxe_add_index(mr); + rxe_set_mr_lkey(mr); + if (access & IB_ACCESS_REMOTE) + mr->ibmr.rkey = mr->ibmr.lkey; + + /* TODO should not have two copies of lkey and rkey in mr */ + mr->lkey = mr->ibmr.lkey; + mr->rkey = mr->ibmr.rkey; mr->state = RXE_MEM_STATE_INVALID; mr->type = RXE_MR_TYPE_NONE; mr->map_shift = ilog2(RXE_BUF_PER_MAP); } -void rxe_mr_cleanup(struct rxe_pool_entry *arg) -{ - struct rxe_mr *mr = container_of(arg, typeof(*mr), pelem); - int i; - - ib_umem_release(mr->umem); - - if (mr->map) { - for (i = 0; i < mr->num_map; i++) - kfree(mr->map[i]); - - kfree(mr->map); - } -} - static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf) { int i; @@ -541,9 +521,8 @@ struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key, { struct rxe_mr *mr; struct rxe_dev *rxe = to_rdev(pd->ibpd.device); - int index = key >> 8; - mr = rxe_pool_get_index(&rxe->mr_pool, index); + mr = rxe_pool_get_key(&rxe->mr_pool, &key); if (!mr) return NULL; @@ -558,3 +537,21 @@ struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key, return mr; } + +void rxe_mr_cleanup(struct rxe_pool_entry *arg) +{ + struct rxe_mr *mr = container_of(arg, typeof(*mr), pelem); + int i; + + ib_umem_release(mr->umem); + + if (mr->map) { + for (i = 0; i < mr->num_map; i++) + kfree(mr->map[i]); + + kfree(mr->map); + } + + rxe_drop_index(mr); + rxe_drop_key(mr); +} diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c new file mode 100644 index 000000000000..9b52a96f25ba --- /dev/null +++ b/drivers/infiniband/sw/rxe/rxe_mw.c @@ -0,0 +1,98 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* + * Copyright (c) 2020 Hewlett Packard Enterprise, Inc. All rights reserved. + */ + +#include "rxe.h" +#include "rxe_loc.h" + +/* choose a unique non zero random number for rkey + * use high order bit to indicate MR vs MW + */ +static void rxe_set_mw_rkey(struct rxe_mw *mw) +{ + u32 rkey; +again: + get_random_bytes(&rkey, sizeof(rkey)); + rkey |= IS_MW; + if (likely((rkey & ~IS_MW) && + (rxe_add_key(mw, &rkey) == 0))) + return; + goto again; +} + +struct ib_mw *rxe_alloc_mw(struct ib_pd *ibpd, enum ib_mw_type type, + struct ib_udata *udata) +{ + struct rxe_pd *pd = to_rpd(ibpd); + struct rxe_dev *rxe = to_rdev(ibpd->device); + struct rxe_mw *mw; + struct rxe_alloc_mw_resp __user *uresp = NULL; + + if (udata) { + if (udata->outlen < sizeof(*uresp)) + return ERR_PTR(-EINVAL); + uresp = udata->outbuf; + } + + if (unlikely((type != IB_MW_TYPE_1) && + (type != IB_MW_TYPE_2))) + return ERR_PTR(-EINVAL); + + rxe_add_ref(pd); + + mw = rxe_alloc(&rxe->mw_pool); + if (unlikely(!mw)) { + rxe_drop_ref(pd); + return ERR_PTR(-ENOMEM); + } + + rxe_add_index(mw); + rxe_set_mw_rkey(mw); + + spin_lock_init(&mw->lock); + mw->qp = NULL; + mw->mr = NULL; + mw->addr = 0; + mw->length = 0; + mw->ibmw.pd = ibpd; + mw->ibmw.type = type; + mw->state = (type == IB_MW_TYPE_2) ? 
+ RXE_MEM_STATE_FREE : + RXE_MEM_STATE_VALID; + + if (uresp) { + if (copy_to_user(&uresp->index, &mw->pelem.index, + sizeof(uresp->index))) { + rxe_drop_ref(mw); + rxe_drop_ref(pd); + return ERR_PTR(-EFAULT); + } + } + + return &mw->ibmw; +} + +int rxe_dealloc_mw(struct ib_mw *ibmw) +{ + struct rxe_mw *mw = to_rmw(ibmw); + struct rxe_pd *pd = to_rpd(ibmw->pd); + unsigned long flags; + + spin_lock_irqsave(&mw->lock, flags); + mw->state = RXE_MEM_STATE_INVALID; + spin_unlock_irqrestore(&mw->lock, flags); + + rxe_drop_ref(pd); + rxe_drop_ref(mw); + + return 0; +} + +void rxe_mw_cleanup(struct rxe_pool_entry *arg) +{ + struct rxe_mw *mw = container_of(arg, typeof(*mw), pelem); + + rxe_drop_index(mw); + rxe_drop_key(mw); +} diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c index 30b8f037ee20..4bcb19a7b918 100644 --- a/drivers/infiniband/sw/rxe/rxe_pool.c +++ b/drivers/infiniband/sw/rxe/rxe_pool.c @@ -7,13 +7,12 @@ #include "rxe.h" #include "rxe_loc.h" -/* info about object pools - */ +/* info about object pools */ struct rxe_type_info rxe_type_info[RXE_NUM_TYPES] = { [RXE_TYPE_UC] = { .name = "rxe-uc", .size = sizeof(struct rxe_ucontext), - .flags = RXE_POOL_NO_ALLOC, + .flags = RXE_POOL_NO_ALLOC, }, [RXE_TYPE_PD] = { .name = "rxe-pd", @@ -43,23 +42,30 @@ struct rxe_type_info rxe_type_info[RXE_NUM_TYPES] = { [RXE_TYPE_CQ] = { .name = "rxe-cq", .size = sizeof(struct rxe_cq), - .flags = RXE_POOL_NO_ALLOC, + .flags = RXE_POOL_NO_ALLOC, .cleanup = rxe_cq_cleanup, }, [RXE_TYPE_MR] = { .name = "rxe-mr", .size = sizeof(struct rxe_mr), .cleanup = rxe_mr_cleanup, - .flags = RXE_POOL_INDEX, + .flags = RXE_POOL_INDEX + | RXE_POOL_KEY, .max_index = RXE_MAX_MR_INDEX, .min_index = RXE_MIN_MR_INDEX, + .key_offset = offsetof(struct rxe_mr, ibmr.lkey), + .key_size = sizeof(u32), }, [RXE_TYPE_MW] = { .name = "rxe-mw", .size = sizeof(struct rxe_mw), - .flags = RXE_POOL_INDEX, + .cleanup = rxe_mw_cleanup, + .flags = RXE_POOL_INDEX + | RXE_POOL_KEY, .max_index = RXE_MAX_MW_INDEX, .min_index = RXE_MIN_MW_INDEX, + .key_offset = offsetof(struct rxe_mw, ibmw.rkey), + .key_size = sizeof(u32), }, [RXE_TYPE_MC_GRP] = { .name = "rxe-mc_grp", @@ -223,7 +229,7 @@ static void insert_index(struct rxe_pool *pool, struct rxe_pool_entry *new) return; } -static void insert_key(struct rxe_pool *pool, struct rxe_pool_entry *new) +static int insert_key(struct rxe_pool *pool, struct rxe_pool_entry *new) { struct rb_node **link = &pool->key.tree.rb_node; struct rb_node *parent = NULL; @@ -239,7 +245,7 @@ static void insert_key(struct rxe_pool *pool, struct rxe_pool_entry *new) if (cmp == 0) { pr_warn("key already exists!\n"); - goto out; + return -EAGAIN; } if (cmp > 0) @@ -250,20 +256,23 @@ static void insert_key(struct rxe_pool *pool, struct rxe_pool_entry *new) rb_link_node(&new->key_node, parent, link); rb_insert_color(&new->key_node, &pool->key.tree); -out: - return; + + return 0; } -void rxe_add_key(void *arg, void *key) +int rxe_add_key(void *arg, void *key) { + int ret; struct rxe_pool_entry *elem = arg; struct rxe_pool *pool = elem->pool; unsigned long flags; write_lock_irqsave(&pool->pool_lock, flags); memcpy((u8 *)elem + pool->key.key_offset, key, pool->key.key_size); - insert_key(pool, elem); + ret = insert_key(pool, elem); write_unlock_irqrestore(&pool->pool_lock, flags); + + return ret; } void rxe_drop_key(void *arg) diff --git a/drivers/infiniband/sw/rxe/rxe_pool.h b/drivers/infiniband/sw/rxe/rxe_pool.h index 3d722aae5f15..5be975e3d5d3 100644 --- 
a/drivers/infiniband/sw/rxe/rxe_pool.h +++ b/drivers/infiniband/sw/rxe/rxe_pool.h @@ -122,7 +122,7 @@ void rxe_drop_index(void *elem); /* assign a key to a keyed object and insert object into * pool's rb tree */ -void rxe_add_key(void *elem, void *key); +int rxe_add_key(void *elem, void *key); /* remove elem from rb tree */ void rxe_drop_key(void *elem); diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c index 57236d8c2146..682f30bb3495 100644 --- a/drivers/infiniband/sw/rxe/rxe_req.c +++ b/drivers/infiniband/sw/rxe/rxe_req.c @@ -597,29 +597,29 @@ int rxe_requester(void *arg) if (wqe->mask & WR_REG_MASK) { if (wqe->wr.opcode == IB_WR_LOCAL_INV) { struct rxe_dev *rxe = to_rdev(qp->ibqp.device); - struct rxe_mr *rmr; + struct rxe_mr *mr; - rmr = rxe_pool_get_index(&rxe->mr_pool, - wqe->wr.ex.invalidate_rkey >> 8); - if (!rmr) { + mr = rxe_pool_get_key(&rxe->mr_pool, + &wqe->wr.ex.invalidate_rkey); + if (!mr) { pr_err("No mr for key %#x\n", wqe->wr.ex.invalidate_rkey); wqe->state = wqe_state_error; wqe->status = IB_WC_MW_BIND_ERR; goto exit; } - rmr->state = RXE_MEM_STATE_FREE; - rxe_drop_ref(rmr); + mr->state = RXE_MEM_STATE_FREE; + rxe_drop_ref(mr); wqe->state = wqe_state_done; wqe->status = IB_WC_SUCCESS; } else if (wqe->wr.opcode == IB_WR_REG_MR) { - struct rxe_mr *rmr = to_rmr(wqe->wr.wr.reg.mr); + struct rxe_mr *mr = to_rmr(wqe->wr.wr.reg.mr); - rmr->state = RXE_MEM_STATE_VALID; - rmr->access = wqe->wr.wr.reg.access; - rmr->lkey = wqe->wr.wr.reg.key; - rmr->rkey = wqe->wr.wr.reg.key; - rmr->iova = wqe->wr.wr.reg.mr->iova; + mr->state = RXE_MEM_STATE_VALID; + mr->access = wqe->wr.wr.reg.access; + mr->lkey = wqe->wr.wr.reg.key; + mr->rkey = wqe->wr.wr.reg.key; + mr->iova = wqe->wr.wr.reg.mr->iova; wqe->state = wqe_state_done; wqe->status = IB_WC_SUCCESS; } else { diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c index 69867bf39cfb..885b5bf6dc2e 100644 --- a/drivers/infiniband/sw/rxe/rxe_resp.c +++ b/drivers/infiniband/sw/rxe/rxe_resp.c @@ -888,8 +888,8 @@ static enum resp_states do_complete(struct rxe_qp *qp, wc->wc_flags |= IB_WC_WITH_INVALIDATE; wc->ex.invalidate_rkey = ieth_rkey(pkt); - rmr = rxe_pool_get_index(&rxe->mr_pool, - wc->ex.invalidate_rkey >> 8); + rmr = rxe_pool_get_key(&rxe->mr_pool, + &wc->ex.invalidate_rkey); if (unlikely(!rmr)) { pr_err("Bad rkey %#x invalidation\n", wc->ex.invalidate_rkey); diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 0f4c7d2f743a..7d365c762ff5 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -865,12 +865,14 @@ static struct ib_mr *rxe_get_dma_mr(struct ib_pd *ibpd, int access) struct rxe_pd *pd = to_rpd(ibpd); struct rxe_mr *mr; + rxe_add_ref(pd); + mr = rxe_alloc(&rxe->mr_pool); - if (!mr) + if (!mr) { + rxe_drop_ref(pd); return ERR_PTR(-ENOMEM); + } - rxe_add_index(mr); - rxe_add_ref(pd); rxe_mr_init_dma(pd, access, mr); return &mr->ibmr; @@ -886,6 +888,17 @@ static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd, struct rxe_dev *rxe = to_rdev(ibpd->device); struct rxe_pd *pd = to_rpd(ibpd); struct rxe_mr *mr; + struct rxe_reg_mr_resp __user *uresp = NULL; + + if (udata) { + if (udata->outlen < sizeof(*uresp)) { + err = -EINVAL; + goto err1; + } + uresp = udata->outbuf; + } + + rxe_add_ref(pd); mr = rxe_alloc(&rxe->mr_pool); if (!mr) { @@ -893,22 +906,25 @@ static struct ib_mr *rxe_reg_user_mr(struct ib_pd *ibpd, goto err2; } - rxe_add_index(mr); - - rxe_add_ref(pd); - err = 
rxe_mr_init_user(pd, start, length, iova, - access, udata, mr); + access, udata, mr); if (err) goto err3; - return &mr->ibmr; + if (uresp) { + if (copy_to_user(&uresp->index, &mr->pelem.index, + sizeof(uresp->index))) { + err = -EFAULT; + goto err3; + } + } + return &mr->ibmr; err3: - rxe_drop_ref(pd); - rxe_drop_index(mr); rxe_drop_ref(mr); err2: + rxe_drop_ref(pd); +err1: return ERR_PTR(err); } @@ -918,7 +934,6 @@ static int rxe_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata) mr->state = RXE_MEM_STATE_ZOMBIE; rxe_drop_ref(mr->pd); - rxe_drop_index(mr); rxe_drop_ref(mr); return 0; } @@ -934,16 +949,14 @@ static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type, if (mr_type != IB_MR_TYPE_MEM_REG) return ERR_PTR(-EINVAL); + rxe_add_ref(pd); + mr = rxe_alloc(&rxe->mr_pool); if (!mr) { err = -ENOMEM; goto err1; } - rxe_add_index(mr); - - rxe_add_ref(pd); - err = rxe_mr_init_fast(pd, max_num_sg, mr); if (err) goto err2; @@ -951,10 +964,9 @@ static struct ib_mr *rxe_alloc_mr(struct ib_pd *ibpd, enum ib_mr_type mr_type, return &mr->ibmr; err2: - rxe_drop_ref(pd); - rxe_drop_index(mr); rxe_drop_ref(mr); err1: + rxe_drop_ref(pd); return ERR_PTR(err); } @@ -1101,6 +1113,8 @@ static const struct ib_device_ops rxe_dev_ops = { .reg_user_mr = rxe_reg_user_mr, .req_notify_cq = rxe_req_notify_cq, .resize_cq = rxe_resize_cq, + .alloc_mw = rxe_alloc_mw, + .dealloc_mw = rxe_dealloc_mw, INIT_RDMA_OBJ_SIZE(ib_ah, rxe_ah, ibah), INIT_RDMA_OBJ_SIZE(ib_cq, rxe_cq, ibcq), @@ -1162,6 +1176,8 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name) | BIT_ULL(IB_USER_VERBS_CMD_DESTROY_AH) | BIT_ULL(IB_USER_VERBS_CMD_ATTACH_MCAST) | BIT_ULL(IB_USER_VERBS_CMD_DETACH_MCAST) + | BIT_ULL(IB_USER_VERBS_CMD_ALLOC_MW) + | BIT_ULL(IB_USER_VERBS_CMD_DEALLOC_MW) ; ib_set_device_ops(dev, &rxe_dev_ops); diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h index dbc649c9c43f..2233630fea7f 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.h +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h @@ -319,6 +319,9 @@ struct rxe_mr { struct rxe_map **map; }; +/* use high order bit to separate MW and MR rkeys */ +#define IS_MW (1 << 31) + struct rxe_mw { struct rxe_pool_entry pelem; struct ib_mw ibmw; @@ -441,6 +444,11 @@ static inline struct rxe_mr *to_rmr(struct ib_mr *mr) return mr ? container_of(mr, struct rxe_mr, ibmr) : NULL; } +static inline struct rxe_mw *to_rmw(struct ib_mw *mw) +{ + return mw ? 
container_of(mw, struct rxe_mw, ibmw) : NULL; +} + int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name); void rxe_mc_cleanup(struct rxe_pool_entry *arg); diff --git a/include/uapi/rdma/rdma_user_rxe.h b/include/uapi/rdma/rdma_user_rxe.h index d8f2e0e46dab..4ad0fa0b2ab9 100644 --- a/include/uapi/rdma/rdma_user_rxe.h +++ b/include/uapi/rdma/rdma_user_rxe.h @@ -175,4 +175,14 @@ struct rxe_modify_srq_cmd { __aligned_u64 mmap_info_addr; }; +struct rxe_reg_mr_resp { + __u32 index; + __u32 reserved; +}; + +struct rxe_alloc_mw_resp { + __u32 index; + __u32 reserved; +}; + #endif /* RDMA_USER_RXE_H */ From patchwork Thu Sep 3 22:40:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bob Pearson X-Patchwork-Id: 11755461 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 75173109A for ; Thu, 3 Sep 2020 22:41:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 55C4420786 for ; Thu, 3 Sep 2020 22:41:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="dkJvvC+7" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729306AbgICWl4 (ORCPT ); Thu, 3 Sep 2020 18:41:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55702 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728294AbgICWlp (ORCPT ); Thu, 3 Sep 2020 18:41:45 -0400 Received: from mail-ot1-x344.google.com (mail-ot1-x344.google.com [IPv6:2607:f8b0:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 81FE8C061249 for ; Thu, 3 Sep 2020 15:41:45 -0700 (PDT) Received: by mail-ot1-x344.google.com with SMTP id c10so4189013otm.13 for ; Thu, 03 Sep 2020 15:41:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=LvB0OPx3ax3nLRmKItSEyKR/mWbJUqis49hglZAfEuU=; b=dkJvvC+7217jStSGGSmeH7z7cD7Zn//ja8lzeUyIR1SE7+EennqelQiMi2Du+zp3UE I3mmZBbYUzxWX6faRtY+Ee6InHRDDLwvD41kGVxNEYT0F5fQXB+7ILRM/jqa+D80+Juv TJwLKBX5CQawnPhiqZ4ltwhuetPJCS92UmbkSCoT0BeQuo7KolcSq16gJL6SV6LpaDfR qbNqcXLkgV7R+PkNoUfKqUJcb9lMl6KD0TFnQJt98HSOoAszkzu1G88K7EgSjy9BfILx VMqa45jXf28xrOKqudc/NIcd6lXpHWO12H+o3ilqRf/mKjeBUtLO6hZtF40EacvcjE2i ZgbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LvB0OPx3ax3nLRmKItSEyKR/mWbJUqis49hglZAfEuU=; b=ja8+eb0TJEglOyQWscXewOj+It9BwK5SSmk3QJpvT9bv0Rgyvp5OdA3fcDYrwFWnbb uKsrQ9vGiJT4M1pXETod8KF0SDIykgRWOr9RQRPfUsJegnAM5zTHC2J5/mLhef942CYo 9QlGM4w2LXOz76rFWiczaGQA2QiYRri9KyVG//JR/JkGNDUBdKdAx4zGkIUNwU5CaLvG TkIgouQcJyhTRXCT9XZxRThy+fFSD1ZV2afmySq9fQmVaHkppQTH7u63urQ9Wwepu5J8 RYCSej6PiPccPrffeTUN9fSJ7sOz1YyVgqeWwCdGM5Z1o4F6oQ7XyhPpYBarb7LXU1SY bc4g== X-Gm-Message-State: AOAM530Wy1k4k5kr34yywUJrZPD8+PWP9pdUybniCroDbsQg6zL5EQ3x mv7pQixVTnCHgOw3ydWIl28= X-Google-Smtp-Source: ABdhPJwbMzPkAkNeP7AVMuayrdtLtp0vJXiEM7hlUEv0q2N9KN0MdqZlA67jzuyv/6Ss30cFEUM35Q== X-Received: by 2002:a05:6830:13:: with SMTP id c19mr3436385otp.65.1599172904805; Thu, 03 Sep 2020 15:41:44 -0700 (PDT) Received: from localhost 
([2605:6000:8b03:f000:6a3a:fc5c:851c:306a]) by smtp.gmail.com with ESMTPSA id r6sm855865otc.0.2020.09.03.15.41.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Sep 2020 15:41:44 -0700 (PDT) From: Bob Pearson X-Google-Original-From: Bob Pearson To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org Cc: Bob Pearson Subject: [PATCH v4 for-next 6/7] rdma_rxe: added bind_mw and invalidate_mw verbs Date: Thu, 3 Sep 2020 17:40:39 -0500 Message-Id: <20200903224039.437391-7-rpearson@hpe.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200903224039.437391-1-rpearson@hpe.com> References: <20200903224039.437391-1-rpearson@hpe.com> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org - Added code to implement ibv_bind_mw (for type 1 MWs) and post send queue bind_mw (for type 2 MWs). - Added code to implement local (post send) and remote (send with invalidate) invalidate operations. - Added rules checking for MW operations from IBA. Signed-off-by: Bob Pearson --- drivers/infiniband/sw/rxe/rxe_comp.c | 1 + drivers/infiniband/sw/rxe/rxe_loc.h | 2 + drivers/infiniband/sw/rxe/rxe_mr.c | 3 +- drivers/infiniband/sw/rxe/rxe_mw.c | 289 ++++++++++++++++++++++++- drivers/infiniband/sw/rxe/rxe_opcode.c | 11 +- drivers/infiniband/sw/rxe/rxe_opcode.h | 1 - drivers/infiniband/sw/rxe/rxe_req.c | 81 +++++-- drivers/infiniband/sw/rxe/rxe_verbs.c | 2 +- drivers/infiniband/sw/rxe/rxe_verbs.h | 7 + include/uapi/rdma/rdma_user_rxe.h | 34 ++- 10 files changed, 399 insertions(+), 32 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c index 5dc86c9e74c2..8b81d3b24a8a 100644 --- a/drivers/infiniband/sw/rxe/rxe_comp.c +++ b/drivers/infiniband/sw/rxe/rxe_comp.c @@ -103,6 +103,7 @@ static enum ib_wc_opcode wr_to_wc_opcode(enum ib_wr_opcode opcode) case IB_WR_RDMA_READ_WITH_INV: return IB_WC_RDMA_READ; case IB_WR_LOCAL_INV: return IB_WC_LOCAL_INV; case IB_WR_REG_MR: return IB_WC_REG_MR; + case IB_WR_BIND_MW: return IB_WC_BIND_MW; default: return 0xff; diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h index 65f2e4a94956..d9a4004fddaa 100644 --- a/drivers/infiniband/sw/rxe/rxe_loc.h +++ b/drivers/infiniband/sw/rxe/rxe_loc.h @@ -117,6 +117,8 @@ int rxe_dealloc_mw(struct ib_mw *ibmw); void rxe_mw_cleanup(struct rxe_pool_entry *arg); +int rxe_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe); + /* rxe_net.c */ void rxe_loopback(struct sk_buff *skb); int rxe_send(struct rxe_pkt_info *pkt, struct sk_buff *skb); diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c index 4c53badfa4e9..f506dff25fdf 100644 --- a/drivers/infiniband/sw/rxe/rxe_mr.c +++ b/drivers/infiniband/sw/rxe/rxe_mr.c @@ -543,7 +543,8 @@ void rxe_mr_cleanup(struct rxe_pool_entry *arg) struct rxe_mr *mr = container_of(arg, typeof(*mr), pelem); int i; - ib_umem_release(mr->umem); + if (mr->umem) + ib_umem_release(mr->umem); if (mr->map) { for (i = 0; i < mr->num_map; i++) diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c index 9b52a96f25ba..9221726e94c2 100644 --- a/drivers/infiniband/sw/rxe/rxe_mw.c +++ b/drivers/infiniband/sw/rxe/rxe_mw.c @@ -30,7 +30,7 @@ struct ib_mw *rxe_alloc_mw(struct ib_pd *ibpd, enum ib_mw_type type, struct rxe_alloc_mw_resp __user *uresp = NULL; if (udata) { - if (udata->outlen < sizeof(*uresp)) + if (unlikely(udata->outlen < sizeof(*uresp))) return ERR_PTR(-EINVAL); uresp = 
udata->outbuf; } @@ -62,10 +62,9 @@ struct ib_mw *rxe_alloc_mw(struct ib_pd *ibpd, enum ib_mw_type type, RXE_MEM_STATE_VALID; if (uresp) { - if (copy_to_user(&uresp->index, &mw->pelem.index, - sizeof(uresp->index))) { + if (unlikely(copy_to_user(&uresp->index, &mw->pelem.index, + sizeof(uresp->index)))) { rxe_drop_ref(mw); - rxe_drop_ref(pd); return ERR_PTR(-EFAULT); } } @@ -73,22 +72,298 @@ struct ib_mw *rxe_alloc_mw(struct ib_pd *ibpd, enum ib_mw_type type, return &mw->ibmw; } +/* cleanup mw in case someone is still holding a ref */ +static void do_dealloc_mw(struct rxe_mw *mw) +{ + if (mw->mr) { + rxe_drop_ref(mw->mr); + atomic_dec(&mw->mr->num_mw); + mw->mr = NULL; + } + + mw->qp = NULL; + mw->access = 0; + mw->addr = 0; + mw->length = 0; + mw->state = RXE_MEM_STATE_INVALID; +} + int rxe_dealloc_mw(struct ib_mw *ibmw) { struct rxe_mw *mw = to_rmw(ibmw); - struct rxe_pd *pd = to_rpd(ibmw->pd); unsigned long flags; spin_lock_irqsave(&mw->lock, flags); - mw->state = RXE_MEM_STATE_INVALID; + + do_dealloc_mw(mw); + + spin_unlock_irqrestore(&mw->lock, flags); + + rxe_drop_ref(mw); + + return 0; +} + +/* Check the rules for bind MW oepration. */ +static int check_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe, + struct rxe_mw *mw, struct rxe_mr *mr) +{ + /* check to see if bind operation came through + * ibv_bind_mw verbs API. + */ + switch (mw->ibmw.type) { + case IB_MW_TYPE_1: + /* o10-37.2.34 */ + if (unlikely(!(wqe->wr.wr.umw.flags & RXE_BIND_MW))) { + pr_err_once("attempt to bind type 1 MW with send WR\n"); + return -EINVAL; + } + break; + case IB_MW_TYPE_2: + /* o10-37.2.35 */ + if (unlikely(wqe->wr.wr.umw.flags & RXE_BIND_MW)) { + pr_err_once("attempt to bind type 2 MW with verbs API\n"); + return -EINVAL; + } + + /* C10-72 */ + if (unlikely(qp->pd != to_rpd(mw->ibmw.pd))) { + pr_err_once("attempt to bind type 2 MW with qp with different PD\n"); + return -EINVAL; + } + + /* o10-37.2.40 */ + if (unlikely(wqe->wr.wr.umw.length == 0)) { + pr_err_once("attempt to invalidate type 2 MW by binding with zero length\n"); + return -EINVAL; + } + + if (unlikely(!mr)) { + pr_err_once("attempt to bind MW to a NULL mr\n"); + return -EINVAL; + } + break; + default: + return -EINVAL; + } + + if (unlikely((mw->ibmw.type == IB_MW_TYPE_1) && + (mw->state != RXE_MEM_STATE_VALID))) { + pr_err_once("attempt to bind a type 1 MW not in the valid state\n"); + return -EINVAL; + } + + /* o10-36.2.2 */ + if (unlikely((mw->access & IB_ZERO_BASED) && + (mw->ibmw.type == IB_MW_TYPE_1))) { + pr_err_once("attempt to bind a zero based type 1 MW\n"); + return -EINVAL; + } + + if (unlikely((wqe->wr.wr.umw.rkey & 0xff) == (mw->ibmw.rkey & 0xff))) { + pr_err_once("attempt to bind MW with same key\n"); + return -EINVAL; + } + + /* remaining checks only apply to a nonzero MR */ + if (!mr) + return 0; + + if (unlikely(mr->access & IB_ZERO_BASED)) { + pr_err_once("attempt to bind MW to zero based MR\n"); + return -EINVAL; + } + + /* o10-37.2.30 */ + if (unlikely((mw->ibmw.type == IB_MW_TYPE_2) && + (mw->state != RXE_MEM_STATE_FREE))) { + pr_err_once("attempt to bind a type 2 MW not in the free state\n"); + return -EINVAL; + } + + /* C10-73 */ + if (unlikely(!(mr->access & IB_ACCESS_MW_BIND))) { + pr_err_once("attempt to bind an MW to an MR without bind access\n"); + return -EINVAL; + } + + /* C10-74 */ + if (unlikely((mw->access & (IB_ACCESS_REMOTE_WRITE | + IB_ACCESS_REMOTE_ATOMIC)) && + !(mr->access & IB_ACCESS_LOCAL_WRITE))) { + pr_err_once("attempt to bind an writeable MW to an MR without local write access\n"); + 
return -EINVAL; + } + + /* C10-75 */ + if (mw->access & IB_ZERO_BASED) { + if (unlikely(wqe->wr.wr.umw.length > mr->length)) { + pr_err_once("attempt to bind a ZB MW outside of the MR\n"); + return -EINVAL; + } + } else { + if (unlikely((wqe->wr.wr.umw.addr < mr->iova) || + ((wqe->wr.wr.umw.addr + wqe->wr.wr.umw.length) > + (mr->iova + mr->length)))) { + pr_err_once("attempt to bind a VA MW outside of the MR\n"); + return -EINVAL; + } + } + + return 0; +} + +static int do_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe, + struct rxe_mw *mw, struct rxe_mr *mr) +{ + u32 rkey; + u32 new_rkey; + struct rxe_mw *duplicate_mw; + struct rxe_dev *rxe = to_rdev(qp->ibqp.device); + + /* key part of new rkey is provided by user for type 2 + * and ibv_bind_mw() for type 1 MWs + * there is a very rare chance that the new rkey will + * collide with an existing MW. Return an error if this + * occurs + */ + rkey = mw->ibmw.rkey; + new_rkey = (rkey & 0xffffff00) | (wqe->wr.wr.umw.rkey & 0x000000ff); + duplicate_mw = rxe_pool_get_key(&rxe->mw_pool, &new_rkey); + if (duplicate_mw) { + pr_err_once("new MW key is a duplicate, try another\n"); + rxe_drop_ref(duplicate_mw); + return -EINVAL; + } + + rxe_drop_key(mw); + rxe_add_key(mw, &new_rkey); + + mw->access = wqe->wr.wr.umw.access; + mw->state = RXE_MEM_STATE_VALID; + mw->addr = wqe->wr.wr.umw.addr; + mw->length = wqe->wr.wr.umw.length; + + if (mw->mr) { + rxe_drop_ref(mw->mr); + atomic_dec(&mw->mr->num_mw); + mw->mr = NULL; + } + + if (mw->length) { + mw->mr = mr; + atomic_inc(&mr->num_mw); + rxe_add_ref(mr); + } + + if (mw->ibmw.type == IB_MW_TYPE_2) + mw->qp = qp; + + return 0; +} + +int rxe_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe) +{ + int ret; + struct rxe_mw *mw; + struct rxe_mr *mr; + struct rxe_dev *rxe = to_rdev(qp->ibqp.device); + unsigned long flags; + + if (qp->is_user) { + mw = rxe_pool_get_index(&rxe->mw_pool, + wqe->wr.wr.umw.mw_index); + if (!mw) { + pr_err_once("mw with index = %d not found\n", + wqe->wr.wr.umw.mw_index); + ret = -EINVAL; + goto err1; + } + mr = rxe_pool_get_index(&rxe->mr_pool, + wqe->wr.wr.umw.mr_index); + if (!mr && wqe->wr.wr.umw.length) { + pr_err_once("mr with index = %d not found\n", + wqe->wr.wr.umw.mr_index); + ret = -EINVAL; + goto err2; + } + } else { + mw = to_rmw(wqe->wr.wr.kmw.mw); + rxe_add_ref(mw); + if (wqe->wr.wr.kmw.mr) { + mr = to_rmr(wqe->wr.wr.kmw.mr); + rxe_add_ref(mr); + } else { + mr = NULL; + } + } + + spin_lock_irqsave(&mw->lock, flags); + + ret = check_bind_mw(qp, wqe, mw, mr); + if (ret) + goto err3; + + ret = do_bind_mw(qp, wqe, mw, mr); +err3: spin_unlock_irqrestore(&mw->lock, flags); - rxe_drop_ref(pd); + if (mr) + rxe_drop_ref(mr); +err2: rxe_drop_ref(mw); +err1: + return ret; +} + +static int check_invalidate_mw(struct rxe_qp *qp, struct rxe_mw *mw) +{ + if (unlikely(mw->state != RXE_MEM_STATE_VALID)) { + pr_err_once("attempt to invalidate a MW that is not valid\n"); + return -EINVAL; + } + + /* o10-37.2.26 */ + if (unlikely(mw->ibmw.type == IB_MW_TYPE_1)) { + pr_err_once("attempt to invalidate a type 1 MW\n"); + return -EINVAL; + } return 0; } +static void do_invalidate_mw(struct rxe_mw *mw) +{ + mw->qp = NULL; + + rxe_drop_ref(mw->mr); + atomic_dec(&mw->mr->num_mw); + mw->mr = NULL; + + mw->access = 0; + mw->addr = 0; + mw->length = 0; + mw->state = RXE_MEM_STATE_FREE; +} + +int rxe_invalidate_mw(struct rxe_qp *qp, struct rxe_mw *mw) +{ + int ret; + unsigned long flags; + + spin_lock_irqsave(&mw->lock, flags); + + ret = check_invalidate_mw(qp, mw); + if (ret) + goto err; + + 
do_invalidate_mw(mw); +err: + spin_unlock_irqrestore(&mw->lock, flags); + + return ret; +} + void rxe_mw_cleanup(struct rxe_pool_entry *arg) { struct rxe_mw *mw = container_of(arg, typeof(*mw), pelem); diff --git a/drivers/infiniband/sw/rxe/rxe_opcode.c b/drivers/infiniband/sw/rxe/rxe_opcode.c index 0cb4b01fd910..5532f01ae5a3 100644 --- a/drivers/infiniband/sw/rxe/rxe_opcode.c +++ b/drivers/infiniband/sw/rxe/rxe_opcode.c @@ -87,13 +87,20 @@ struct rxe_wr_opcode_info rxe_wr_opcode_info[] = { [IB_WR_LOCAL_INV] = { .name = "IB_WR_LOCAL_INV", .mask = { - [IB_QPT_RC] = WR_REG_MASK, + [IB_QPT_RC] = WR_LOCAL_MASK, }, }, [IB_WR_REG_MR] = { .name = "IB_WR_REG_MR", .mask = { - [IB_QPT_RC] = WR_REG_MASK, + [IB_QPT_RC] = WR_LOCAL_MASK, + }, + }, + [IB_WR_BIND_MW] = { + .name = "IB_WR_BIND_MW", + .mask = { + [IB_QPT_RC] = WR_LOCAL_MASK, + [IB_QPT_UC] = WR_LOCAL_MASK, }, }, }; diff --git a/drivers/infiniband/sw/rxe/rxe_opcode.h b/drivers/infiniband/sw/rxe/rxe_opcode.h index 1041ac9a9233..440e34f446bd 100644 --- a/drivers/infiniband/sw/rxe/rxe_opcode.h +++ b/drivers/infiniband/sw/rxe/rxe_opcode.h @@ -20,7 +20,6 @@ enum rxe_wr_mask { WR_READ_MASK = BIT(3), WR_WRITE_MASK = BIT(4), WR_LOCAL_MASK = BIT(5), - WR_REG_MASK = BIT(6), WR_READ_OR_WRITE_MASK = WR_READ_MASK | WR_WRITE_MASK, WR_READ_WRITE_OR_SEND_MASK = WR_READ_OR_WRITE_MASK | WR_SEND_MASK, diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c index 682f30bb3495..39ca88030d3a 100644 --- a/drivers/infiniband/sw/rxe/rxe_req.c +++ b/drivers/infiniband/sw/rxe/rxe_req.c @@ -524,9 +524,9 @@ static void save_state(struct rxe_send_wqe *wqe, struct rxe_send_wqe *rollback_wqe, u32 *rollback_psn) { - rollback_wqe->state = wqe->state; + rollback_wqe->state = wqe->state; rollback_wqe->first_psn = wqe->first_psn; - rollback_wqe->last_psn = wqe->last_psn; + rollback_wqe->last_psn = wqe->last_psn; *rollback_psn = qp->req.psn; } @@ -559,6 +559,8 @@ static void update_state(struct rxe_qp *qp, struct rxe_send_wqe *wqe, int rxe_requester(void *arg) { struct rxe_qp *qp = (struct rxe_qp *)arg; + struct rxe_dev *rxe = to_rdev(qp->ibqp.device); + struct rxe_mr *mr; struct rxe_pkt_info pkt; struct sk_buff *skb; struct rxe_send_wqe *wqe; @@ -594,11 +596,9 @@ int rxe_requester(void *arg) if (unlikely(!wqe)) goto exit; - if (wqe->mask & WR_REG_MASK) { - if (wqe->wr.opcode == IB_WR_LOCAL_INV) { - struct rxe_dev *rxe = to_rdev(qp->ibqp.device); - struct rxe_mr *mr; - + if (wqe->mask & WR_LOCAL_MASK) { + switch (wqe->wr.opcode) { + case IB_WR_LOCAL_INV: mr = rxe_pool_get_key(&rxe->mr_pool, &wqe->wr.ex.invalidate_rkey); if (!mr) { @@ -606,15 +606,15 @@ int rxe_requester(void *arg) wqe->wr.ex.invalidate_rkey); wqe->state = wqe_state_error; wqe->status = IB_WC_MW_BIND_ERR; - goto exit; + goto err; } mr->state = RXE_MEM_STATE_FREE; rxe_drop_ref(mr); wqe->state = wqe_state_done; wqe->status = IB_WC_SUCCESS; - } else if (wqe->wr.opcode == IB_WR_REG_MR) { - struct rxe_mr *mr = to_rmr(wqe->wr.wr.reg.mr); - + break; + case IB_WR_REG_MR: + mr = to_rmr(wqe->wr.wr.reg.mr); mr->state = RXE_MEM_STATE_VALID; mr->access = wqe->wr.wr.reg.access; mr->lkey = wqe->wr.wr.reg.key; @@ -622,14 +622,30 @@ int rxe_requester(void *arg) mr->iova = wqe->wr.wr.reg.mr->iova; wqe->state = wqe_state_done; wqe->status = IB_WC_SUCCESS; - } else { - goto exit; + break; + case IB_WR_BIND_MW: + ret = rxe_bind_mw(qp, wqe); + if (ret) { + wqe->state = wqe_state_done; + wqe->status = IB_WC_MW_BIND_ERR; + goto err; + } + wqe->state = wqe_state_done; + wqe->status = IB_WC_SUCCESS; + break; 
+ default: + pr_err_once("unexpected LOCAL WR opcode = %d\n", + wqe->wr.opcode); + goto err; } + + qp->req.wqe_index = next_index(qp->sq.queue, + qp->req.wqe_index); + if ((wqe->wr.send_flags & IB_SEND_SIGNALED) || qp->sq_sig_type == IB_SIGNAL_ALL_WR) rxe_run_task(&qp->comp.task, 1); - qp->req.wqe_index = next_index(qp->sq.queue, - qp->req.wqe_index); + goto next_wqe; } @@ -649,6 +665,7 @@ int rxe_requester(void *arg) opcode = next_opcode(qp, wqe, wqe->wr.opcode); if (unlikely(opcode < 0)) { wqe->status = IB_WC_LOC_QP_OP_ERR; + /* TODO this should be goto err */ goto exit; } @@ -678,8 +695,7 @@ int rxe_requester(void *arg) wqe->state = wqe_state_done; wqe->status = IB_WC_SUCCESS; __rxe_do_task(&qp->comp.task); - rxe_drop_ref(qp); - return 0; + goto again; } payload = mtu; } @@ -687,12 +703,14 @@ int rxe_requester(void *arg) skb = init_req_packet(qp, wqe, opcode, payload, &pkt); if (unlikely(!skb)) { pr_err("qp#%d Failed allocating skb\n", qp_num(qp)); + wqe->status = IB_WC_LOC_PROT_ERR; goto err; } if (fill_packet(qp, wqe, &pkt, skb, payload)) { pr_debug("qp#%d Error during fill packet\n", qp_num(qp)); kfree_skb(skb); + wqe->status = IB_WC_LOC_PROT_ERR; goto err; } @@ -716,6 +734,7 @@ int rxe_requester(void *arg) goto exit; } + wqe->status = IB_WC_LOC_PROT_ERR; goto err; } @@ -724,11 +743,35 @@ int rxe_requester(void *arg) goto next_wqe; err: - wqe->status = IB_WC_LOC_PROT_ERR; + /* we come here if an error occurred while processing + * a send wqe. The completer will put the qp in error + * state and no more wqes will be processed unless + * the qp is cleaned up and restarted. We do not want + * to be called again + */ wqe->state = wqe_state_error; __rxe_do_task(&qp->comp.task); + ret = -EAGAIN; + goto done; exit: + /* we come here if either there are no more wqes in the send + * queue or we are blocked waiting for some resource or event. + * The current wqe will be restarted or new wqe started when + * there is work to do or we can complete the current wqe. + */ + ret = -EAGAIN; + goto done; + +again: + /* we come here if we are done with the current wqe but want to + * get called again. 
Mostly we loop back to next wqe so should + * be all one way or the other + */ + ret = 0; + goto done; + +done: rxe_drop_ref(qp); - return -EAGAIN; + return ret; } diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 7d365c762ff5..16f588172dba 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -574,7 +574,7 @@ static int init_send_wqe(struct rxe_qp *qp, const struct ib_send_wr *ibwr, p += sge->length; } - } else if (mask & WR_REG_MASK) { + } else if (mask & WR_LOCAL_MASK) { wqe->mask = mask; wqe->state = wqe_state_posted; return 0; diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h index 2233630fea7f..2fb5581edd8a 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.h +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h @@ -316,9 +316,16 @@ struct rxe_mr { u32 max_buf; u32 num_map; + atomic_t num_mw; + struct rxe_map **map; }; +enum rxe_send_flags { + /* flag indicaes bind call came through verbs API */ + RXE_BIND_MW = (1 << 0), +}; + /* use high order bit to separate MW and MR rkeys */ #define IS_MW (1 << 31) diff --git a/include/uapi/rdma/rdma_user_rxe.h b/include/uapi/rdma/rdma_user_rxe.h index 4ad0fa0b2ab9..d49125682359 100644 --- a/include/uapi/rdma/rdma_user_rxe.h +++ b/include/uapi/rdma/rdma_user_rxe.h @@ -93,7 +93,39 @@ struct rxe_send_wr { __u32 remote_qkey; __u16 pkey_index; } ud; - /* reg is only used by the kernel and is not part of the uapi */ + struct { + __aligned_u64 addr; + __aligned_u64 length; + union { + __u32 mr_index; + __aligned_u64 reserved1; + }; + union { + __u32 mw_index; + __aligned_u64 reserved2; + }; + __u32 rkey; + __u32 access; + __u32 flags; + } umw; + /* The following are only used by the kernel + * and are not part of the uapi + */ + struct { + __aligned_u64 addr; + __aligned_u64 length; + union { + struct ib_mr *mr; + __aligned_u64 reserved1; + }; + union { + struct ib_mw *mw; + __aligned_u64 reserved2; + }; + __u32 rkey; + __u32 access; + __u32 flags; + } kmw; struct { union { struct ib_mr *mr; From patchwork Thu Sep 3 22:40:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bob Pearson X-Patchwork-Id: 11755463 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5E65C618 for ; Thu, 3 Sep 2020 22:42:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2CCBE20716 for ; Thu, 3 Sep 2020 22:42:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="FwEZ8Lkv" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729353AbgICWl4 (ORCPT ); Thu, 3 Sep 2020 18:41:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55684 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728486AbgICWlr (ORCPT ); Thu, 3 Sep 2020 18:41:47 -0400 Received: from mail-oi1-x241.google.com (mail-oi1-x241.google.com [IPv6:2607:f8b0:4864:20::241]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7F8D3C061245 for ; Thu, 3 Sep 2020 15:41:46 -0700 (PDT) Received: by mail-oi1-x241.google.com with SMTP id i17so4735782oig.10 for ; Thu, 03 Sep 2020 15:41:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; 
h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=hPPLIU7w4zlg+uqFZTB04qwcFQvR04r359zsYVuV3AY=; b=FwEZ8LkvjXlwc1CpLjMw/O8a8vaYqPYeQG9YMSCz1H/V/gdXUrLRIRdYmj6w9YgFF+ jIkq2jU2zbprQTMyF7kiEcYYJ3xoMMcXjPWWG+wbLlADnLU6kZ4bVejk9QsRJiR5kk2s oQsKfEPhWWfSIfeviWAePkxI66fRE6DI/y17zDJmYZhytD3oIXdjVtOt+37RTvkNbWrR aJgjrxUCizAfQGO4HGo3kKiD6o+YhNd8vstFhY4wXpp/99QOB7rGdIYDJecCY6gvU1Nd DycZeQK8OXaBxFjF7mju2RzAjQKd1oLFgfyRN7UsCLMMQCQyToXTSSiH3201fijQVJHu 2UoQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hPPLIU7w4zlg+uqFZTB04qwcFQvR04r359zsYVuV3AY=; b=tTFG3eVpKZEFZx7UTJALA41pcZzzSQCjvIOAu3odVrzwNgbkLrCcWCPoPNkdc9QqTG lCoVbdPXpj1Tp/mRh1hp7nno5xkcj22GdMNR5un5PtaA0G7HFFNL/073w4Ei65gxi7Pg FV0GqfCH3TadE9YSBvyOFZsTgSPScoaiLeZpxMdI1kglVaNzjifk3IVcZvs48MewlLOH LUdtNdCcw2BeBw6/M+5sd8x5/SciFRUEycHt1CPE9mxMrxPHbQYav6F2T68gooXfhUcH JuyOA1+mXxSIoeHgPePErX9V5DPjN34KOOwklfNW+P25WZC0qrIQEssfbJWGzL/icA+N +Tjw== X-Gm-Message-State: AOAM530E4hfe4nZJGYiB3JfMO704WD9wtPs95FW4RIBKr5hjXByQ1nNy n5V5mujhnqPbl4+XC04knIU= X-Google-Smtp-Source: ABdhPJzvLP+widMJfT2rw7XW1n43bu+EJIunYvzs805oEXE6dEiZMhJqmJwwNmoP9oG/t1pJjQnCsw== X-Received: by 2002:aca:3a08:: with SMTP id h8mr3495454oia.164.1599172905727; Thu, 03 Sep 2020 15:41:45 -0700 (PDT) Received: from localhost ([2605:6000:8b03:f000:6a3a:fc5c:851c:306a]) by smtp.gmail.com with ESMTPSA id v7sm801873oie.9.2020.09.03.15.41.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Sep 2020 15:41:45 -0700 (PDT) From: Bob Pearson X-Google-Original-From: Bob Pearson To: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org Cc: Bob Pearson Subject: [PATCH v4 for-next 7/7] rdma_rxe: add memory access through MWs Date: Thu, 3 Sep 2020 17:40:40 -0500 Message-Id: <20200903224039.437391-8-rpearson@hpe.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200903224039.437391-1-rpearson@hpe.com> References: <20200903224039.437391-1-rpearson@hpe.com> MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org Implemented memory access through MWs. Added rules checks from IBA. 
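As an illustration of the consumer-visible flow this enables (a sketch only, not
part of the patch itself), the snippet below binds a type 2 MW over an existing
MR through the send queue and returns the rkey a remote peer would use for
RDMA WRITEs through the window. Names follow the rdma-core libibverbs API
(ibv_alloc_mw(), IBV_WR_BIND_MW, ibv_inc_rkey()); pd, qp and mr are assumed to
have been created earlier, with IBV_ACCESS_MW_BIND set on the MR, and the bind
completion is assumed to be reaped elsewhere.

#include <stddef.h>
#include <stdint.h>
#include <infiniband/verbs.h>

/* Illustrative sketch only -- not part of this patch. */
static int bind_type2_mw(struct ibv_pd *pd, struct ibv_qp *qp,
			 struct ibv_mr *mr, void *buf, size_t len,
			 uint32_t *rkey_out)
{
	struct ibv_send_wr wr = { 0 }, *bad_wr;
	struct ibv_mw *mw;

	mw = ibv_alloc_mw(pd, IBV_MW_TYPE_2);
	if (!mw)
		return -1;

	wr.opcode = IBV_WR_BIND_MW;
	wr.send_flags = IBV_SEND_SIGNALED;
	wr.bind_mw.mw = mw;
	/* the consumer supplies only the low 8 bits of the new rkey */
	wr.bind_mw.rkey = ibv_inc_rkey(mw->rkey);
	wr.bind_mw.bind_info.mr = mr;
	wr.bind_mw.bind_info.addr = (uint64_t)(uintptr_t)buf;
	wr.bind_mw.bind_info.length = len;
	wr.bind_mw.bind_info.mw_access_flags = IBV_ACCESS_REMOTE_WRITE;

	if (ibv_post_send(qp, &wr, &bad_wr)) {
		ibv_dealloc_mw(mw);
		return -1;
	}

	*rkey_out = wr.bind_mw.rkey;
	return 0;
}

On the responder side the rkey carried in the incoming RETH is routed to the
MW pool whenever its high-order IS_MW bit is set; since the bind replaces only
the low 8 bits of the key, the bit assigned at allocation time is preserved.
That dispatch is what the check_rkey() changes below implement.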
Signed-off-by: Bob Pearson --- drivers/infiniband/sw/rxe/rxe_loc.h | 17 +++--- drivers/infiniband/sw/rxe/rxe_mr.c | 74 ++++++++++++++++-------- drivers/infiniband/sw/rxe/rxe_mw.c | 57 +++++++++++++++--- drivers/infiniband/sw/rxe/rxe_req.c | 16 ++---- drivers/infiniband/sw/rxe/rxe_resp.c | 83 ++++++++++++++++++++------- drivers/infiniband/sw/rxe/rxe_verbs.h | 1 + 6 files changed, 176 insertions(+), 72 deletions(-) diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h index d9a4004fddaa..bd8fe4086fd4 100644 --- a/drivers/infiniband/sw/rxe/rxe_loc.h +++ b/drivers/infiniband/sw/rxe/rxe_loc.h @@ -100,25 +100,28 @@ enum lookup_type { lookup_remote, }; -struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key, - enum lookup_type type); +int advance_dma_data(struct rxe_dma_info *dma, unsigned int length); -int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length); +int rxe_mr_check_access(struct rxe_qp *qp, struct rxe_mr *mr, + int access, u64 va, u32 resid); void rxe_mr_cleanup(struct rxe_pool_entry *arg); -int advance_dma_data(struct rxe_dma_info *dma, unsigned int length); - /* rxe_mw.c */ struct ib_mw *rxe_alloc_mw(struct ib_pd *ibpd, enum ib_mw_type type, struct ib_udata *udata); int rxe_dealloc_mw(struct ib_mw *ibmw); -void rxe_mw_cleanup(struct rxe_pool_entry *arg); - int rxe_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe); +int rxe_invalidate_mw(struct rxe_qp *qp, struct rxe_mw *mw); + +int rxe_mw_check_access(struct rxe_qp *qp, struct rxe_mw *mw, + int access, u64 va, u32 resid); + +void rxe_mw_cleanup(struct rxe_pool_entry *arg); + /* rxe_net.c */ void rxe_loopback(struct sk_buff *skb); int rxe_send(struct rxe_pkt_info *pkt, struct sk_buff *skb); diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c index f506dff25fdf..9a1fb125679a 100644 --- a/drivers/infiniband/sw/rxe/rxe_mr.c +++ b/drivers/infiniband/sw/rxe/rxe_mr.c @@ -21,7 +21,7 @@ static void rxe_set_mr_lkey(struct rxe_mr *mr) goto again; } -int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length) +static int mr_check_range(struct rxe_mr *mr, u64 iova, size_t length) { switch (mr->type) { case RXE_MR_TYPE_DMA: @@ -380,6 +380,25 @@ int rxe_mr_copy(struct rxe_mr *mr, u64 iova, void *addr, int length, return err; } +static struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 lkey) +{ + struct rxe_mr *mr; + struct rxe_dev *rxe = to_rdev(pd->ibpd.device); + + mr = rxe_pool_get_key(&rxe->mr_pool, &lkey); + if (!mr) + return NULL; + + if (unlikely((mr->ibmr.lkey != lkey) || (mr->pd != pd) || + (access && !(access & mr->access)) || + (mr->state != RXE_MEM_STATE_VALID))) { + rxe_drop_ref(mr); + return NULL; + } + + return mr; +} + /* copy data in or out of a wqe, i.e. 
sg list * under the control of a dma descriptor */ @@ -409,7 +428,7 @@ int copy_data( } if (sge->length && (offset < sge->length)) { - mr = lookup_mr(pd, access, sge->lkey, lookup_local); + mr = lookup_mr(pd, access, sge->lkey); if (!mr) { err = -EINVAL; goto err1; @@ -434,8 +453,7 @@ int copy_data( } if (sge->length) { - mr = lookup_mr(pd, access, sge->lkey, - lookup_local); + mr = lookup_mr(pd, access, sge->lkey); if (!mr) { err = -EINVAL; goto err1; @@ -510,32 +528,38 @@ int advance_dma_data(struct rxe_dma_info *dma, unsigned int length) return 0; } -/* (1) find the mr corresponding to lkey/rkey - * depending on lookup_type - * (2) verify that the (qp) pd matches the mr pd - * (3) verify that the mr can support the requested access - * (4) verify that mr state is valid - */ -struct rxe_mr *lookup_mr(struct rxe_pd *pd, int access, u32 key, - enum lookup_type type) +int rxe_invalidate_mr(struct rxe_qp *qp, struct rxe_mr *mr) { - struct rxe_mr *mr; - struct rxe_dev *rxe = to_rdev(pd->ibpd.device); + mr->state = RXE_MEM_STATE_FREE; + return 0; +} - mr = rxe_pool_get_key(&rxe->mr_pool, &key); - if (!mr) - return NULL; +int rxe_mr_check_access(struct rxe_qp *qp, struct rxe_mr *mr, + int access, u64 va, u32 resid) +{ + int ret; + struct rxe_pd *pd = to_rpd(mr->ibmr.pd); - if (unlikely((type == lookup_local && mr->lkey != key) || - (type == lookup_remote && mr->rkey != key) || - mr->pd != pd || - (access && !(access & mr->access)) || - mr->state != RXE_MEM_STATE_VALID)) { - rxe_drop_ref(mr); - mr = NULL; + if (unlikely(mr->state != RXE_MEM_STATE_VALID)) { + pr_err("attempt to access a MR that is not in the valid state\n"); + return -EINVAL; } - return mr; + /* C10-56 */ + if (unlikely(pd != qp->pd)) { + pr_err("attempt to access a MR with a different PD than the QP\n"); + return -EINVAL; + } + + /* C10-57 */ + if (unlikely(access && !(access & mr->access))) { + pr_err("attempt to access a MR without required access rights\n"); + return -EINVAL; + } + + ret = mr_check_range(mr, va, resid); + + return ret; } void rxe_mr_cleanup(struct rxe_pool_entry *arg) diff --git a/drivers/infiniband/sw/rxe/rxe_mw.c b/drivers/infiniband/sw/rxe/rxe_mw.c index 9221726e94c2..c4fda759875a 100644 --- a/drivers/infiniband/sw/rxe/rxe_mw.c +++ b/drivers/infiniband/sw/rxe/rxe_mw.c @@ -318,11 +318,6 @@ int rxe_bind_mw(struct rxe_qp *qp, struct rxe_send_wqe *wqe) static int check_invalidate_mw(struct rxe_qp *qp, struct rxe_mw *mw) { - if (unlikely(mw->state != RXE_MEM_STATE_VALID)) { - pr_err_once("attempt to invalidate a MW that is not valid\n"); - return -EINVAL; - } - /* o10-37.2.26 */ if (unlikely(mw->ibmw.type == IB_MW_TYPE_1)) { pr_err_once("attempt to invalidate a type 1 MW\n"); @@ -336,9 +331,11 @@ static void do_invalidate_mw(struct rxe_mw *mw) { mw->qp = NULL; - rxe_drop_ref(mw->mr); - atomic_dec(&mw->mr->num_mw); - mw->mr = NULL; + if (mw->mr) { + atomic_dec(&mw->mr->num_mw); + mw->mr = NULL; + rxe_drop_ref(mw->mr); + } mw->access = 0; mw->addr = 0; @@ -364,6 +361,50 @@ int rxe_invalidate_mw(struct rxe_qp *qp, struct rxe_mw *mw) return ret; } +int rxe_mw_check_access(struct rxe_qp *qp, struct rxe_mw *mw, + int access, u64 va, u32 resid) +{ + struct rxe_pd *pd = to_rpd(mw->ibmw.pd); + + if (unlikely(mw->state != RXE_MEM_STATE_VALID)) { + pr_err_once("attempt to access a MW that is not valid\n"); + return -EINVAL; + } + + /* C10-76.2.1 */ + if (unlikely((mw->ibmw.type == IB_MW_TYPE_1) && (pd != qp->pd))) { + pr_err_once("attempt to access a type 1 MW with a different PD than the QP\n"); + return -EINVAL; + } + + 
/* o10-37.2.43 */ + if (unlikely((mw->ibmw.type == IB_MW_TYPE_2) && (mw->qp != qp))) { + pr_err_once("attempt to access a type 2 MW that is associated with a different QP\n"); + return -EINVAL; + } + + /* C10-77 */ + if (unlikely(access && !(access & mw->access))) { + pr_err_once("attempt to access a MW without sufficient access\n"); + return -EINVAL; + } + + if (mw->access & IB_ZERO_BASED) { + if (unlikely((va + resid) > mw->length)) { + pr_err_once("attempt to access a ZB MW out of bounds\n"); + return -EINVAL; + } + } else { + if (unlikely((va < mw->addr) || + ((va + resid) > (mw->addr + mw->length)))) { + pr_err_once("attempt to access a VA MW out of bounds\n"); + return -EINVAL; + } + } + + return 0; +} + void rxe_mw_cleanup(struct rxe_pool_entry *arg) { struct rxe_mw *mw = container_of(arg, typeof(*mw), pelem); diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c index 39ca88030d3a..e0dc79b960fa 100644 --- a/drivers/infiniband/sw/rxe/rxe_req.c +++ b/drivers/infiniband/sw/rxe/rxe_req.c @@ -604,7 +604,6 @@ int rxe_requester(void *arg) if (!mr) { pr_err("No mr for key %#x\n", wqe->wr.ex.invalidate_rkey); - wqe->state = wqe_state_error; wqe->status = IB_WC_MW_BIND_ERR; goto err; } @@ -626,7 +625,6 @@ int rxe_requester(void *arg) case IB_WR_BIND_MW: ret = rxe_bind_mw(qp, wqe); if (ret) { - wqe->state = wqe_state_done; wqe->status = IB_WC_MW_BIND_ERR; goto err; } @@ -636,6 +634,7 @@ int rxe_requester(void *arg) default: pr_err_once("unexpected LOCAL WR opcode = %d\n", wqe->wr.opcode); + wqe->status = IB_WC_LOC_QP_OP_ERR; goto err; } @@ -679,13 +678,7 @@ int rxe_requester(void *arg) payload = (mask & RXE_WRITE_OR_SEND) ? wqe->dma.resid : 0; if (payload > mtu) { if (qp_type(qp) == IB_QPT_UD) { - /* C10-93.1.1: If the total sum of all the buffer lengths specified for a - * UD message exceeds the MTU of the port as returned by QueryHCA, the CI - * shall not emit any packets for this message. Further, the CI shall not - * generate an error due to this condition. - */ - - /* fake a successful UD send */ + /* C10-93.1.1: fake a successful UD send */ wqe->first_psn = qp->req.psn; wqe->last_psn = qp->req.psn; qp->req.psn = (qp->req.psn + 1) & BTH_PSN_MASK; @@ -750,6 +743,8 @@ int rxe_requester(void *arg) * to be called again */ wqe->state = wqe_state_error; + qp->req.wqe_index = next_index(qp->sq.queue, + qp->req.wqe_index); __rxe_do_task(&qp->comp.task); ret = -EAGAIN; goto done; @@ -765,8 +760,7 @@ int rxe_requester(void *arg) again: /* we come here if we are done with the current wqe but want to - * get called again. Mostly we loop back to next wqe so should - * be all one way or the other + * get called again. 
*/ ret = 0; goto done; diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c index 885b5bf6dc2e..136c7699fed3 100644 --- a/drivers/infiniband/sw/rxe/rxe_resp.c +++ b/drivers/infiniband/sw/rxe/rxe_resp.c @@ -391,6 +391,8 @@ static enum resp_states check_rkey(struct rxe_qp *qp, struct rxe_pkt_info *pkt) { struct rxe_mr *mr = NULL; + struct rxe_mw *mw = NULL; + struct rxe_dev *rxe = to_rdev(qp->ibqp.device); u64 va; u32 rkey; u32 resid; @@ -398,6 +400,7 @@ static enum resp_states check_rkey(struct rxe_qp *qp, int mtu = qp->mtu; enum resp_states state; int access; + unsigned long flags; if (pkt->mask & (RXE_READ_MASK | RXE_WRITE_MASK)) { if (pkt->mask & RXE_RETH_MASK) { @@ -405,6 +408,7 @@ static enum resp_states check_rkey(struct rxe_qp *qp, qp->resp.rkey = reth_rkey(pkt); qp->resp.resid = reth_len(pkt); qp->resp.length = reth_len(pkt); + qp->resp.offset = 0; } access = (pkt->mask & RXE_READ_MASK) ? IB_ACCESS_REMOTE_READ : IB_ACCESS_REMOTE_WRITE; @@ -412,6 +416,7 @@ static enum resp_states check_rkey(struct rxe_qp *qp, qp->resp.va = atmeth_va(pkt); qp->resp.rkey = atmeth_rkey(pkt); qp->resp.resid = sizeof(u64); + qp->resp.offset = 0; access = IB_ACCESS_REMOTE_ATOMIC; } else { return RESPST_EXECUTE; @@ -429,20 +434,46 @@ static enum resp_states check_rkey(struct rxe_qp *qp, resid = qp->resp.resid; pktlen = payload_size(pkt); - mr = lookup_mr(qp->pd, access, rkey, lookup_remote); - if (!mr) { - state = RESPST_ERR_RKEY_VIOLATION; - goto err; - } + /* check rkey on each packet because someone could + * have invalidated, deallocated or unregistered it + * since the last packet + */ + if (rkey & IS_MW) { + mw = rxe_pool_get_key(&rxe->mw_pool, &rkey); + if (!mw) { + pr_err_once("no MW found with rkey = 0x%08x\n", rkey); + state = RESPST_ERR_RKEY_VIOLATION; + goto err; + } - if (unlikely(mr->state == RXE_MEM_STATE_FREE)) { - state = RESPST_ERR_RKEY_VIOLATION; - goto err; - } + spin_lock_irqsave(&mw->lock, flags); + if (rxe_mw_check_access(qp, mw, access, va, resid)) { + spin_unlock_irqrestore(&mw->lock, flags); + rxe_drop_ref(mw); + state = RESPST_ERR_RKEY_VIOLATION; + goto err; + } + + mr = mw->mr; + rxe_add_ref(mr); + + if (mw->access & IB_ZERO_BASED) + qp->resp.offset = mw->addr; - if (mr_check_range(mr, va, resid)) { - state = RESPST_ERR_RKEY_VIOLATION; - goto err; + spin_unlock_irqrestore(&mw->lock, flags); + rxe_drop_ref(mw); + } else { + mr = rxe_pool_get_key(&rxe->mr_pool, &rkey); + if (!mr || (mr->rkey != rkey)) { + pr_err_once("no MR found with rkey = 0x%08x\n", rkey); + state = RESPST_ERR_RKEY_VIOLATION; + goto err; + } + + if (rxe_mr_check_access(qp, mr, access, va, resid)) { + state = RESPST_ERR_RKEY_VIOLATION; + goto err; + } } if (pkt->mask & RXE_WRITE_MASK) { @@ -498,8 +529,8 @@ static enum resp_states write_data_in(struct rxe_qp *qp, int err; int data_len = payload_size(pkt); - err = rxe_mr_copy(qp->resp.mr, qp->resp.va, payload_addr(pkt), - data_len, to_mr_obj, NULL); + err = rxe_mr_copy(qp->resp.mr, qp->resp.va + qp->resp.offset, + payload_addr(pkt), data_len, to_mr_obj, NULL); if (err) { rc = RESPST_ERR_RKEY_VIOLATION; goto out; @@ -518,7 +549,6 @@ static DEFINE_SPINLOCK(atomic_ops_lock); static enum resp_states process_atomic(struct rxe_qp *qp, struct rxe_pkt_info *pkt) { - u64 iova = atmeth_va(pkt); u64 *vaddr; enum resp_states ret; struct rxe_mr *mr = qp->resp.mr; @@ -528,7 +558,7 @@ static enum resp_states process_atomic(struct rxe_qp *qp, goto out; } - vaddr = iova_to_vaddr(mr, iova, sizeof(u64)); + vaddr = iova_to_vaddr(mr, qp->resp.va + 
qp->resp.offset, sizeof(u64)); /* check vaddr is 8 bytes aligned. */ if (!vaddr || (uintptr_t)vaddr & 7) { @@ -653,8 +683,10 @@ static enum resp_states read_reply(struct rxe_qp *qp, res->type = RXE_READ_MASK; res->replay = 0; - res->read.va = qp->resp.va; - res->read.va_org = qp->resp.va; + res->read.va = qp->resp.va + + qp->resp.offset; + res->read.va_org = qp->resp.va + + qp->resp.offset; res->first_psn = req_pkt->psn; @@ -1300,7 +1332,10 @@ int rxe_responder(void *arg) /* Class C */ do_class_ac_error(qp, AETH_NAK_REM_ACC_ERR, IB_WC_REM_ACCESS_ERR); - state = RESPST_COMPLETE; + if (qp->resp.wqe) + state = RESPST_COMPLETE; + else + state = RESPST_ACKNOWLEDGE; } else { qp->resp.drop_msg = 1; if (qp->srq) { @@ -1319,7 +1354,10 @@ int rxe_responder(void *arg) /* Class C */ do_class_ac_error(qp, AETH_NAK_INVALID_REQ, IB_WC_REM_INV_REQ_ERR); - state = RESPST_COMPLETE; + if (qp->resp.wqe) + state = RESPST_COMPLETE; + else + state = RESPST_ACKNOWLEDGE; } else if (qp->srq) { /* UC/UD - class E */ qp->resp.status = IB_WC_REM_INV_REQ_ERR; @@ -1335,7 +1373,10 @@ int rxe_responder(void *arg) /* All, Class A. */ do_class_ac_error(qp, AETH_NAK_REM_OP_ERR, IB_WC_LOC_QP_OP_ERR); - state = RESPST_COMPLETE; + if (qp->resp.wqe) + state = RESPST_COMPLETE; + else + state = RESPST_ACKNOWLEDGE; break; case RESPST_ERR_CQ_OVERFLOW: diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h index 2fb5581edd8a..b24a9a0878c2 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.h +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h @@ -183,6 +183,7 @@ struct rxe_resp_info { /* RDMA read / atomic only */ u64 va; + u64 offset; struct rxe_mr *mr; u32 resid; u32 rkey;
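The offset member added to struct rxe_resp_info above carries the window base
when the rkey names a zero-based MW, and is zero otherwise; all responder
accesses then go through qp->resp.va + qp->resp.offset. A minimal sketch of
that addressing rule, with illustrative names only (not the driver's own code):

#include <stdint.h>

/* For a zero-based window the VA carried in the packet is an offset
 * from the start of the window, so the window base (mw->addr) is
 * folded in before the underlying MR translates the address; for a
 * VA-based window or a plain MR rkey the wire VA is used directly.
 */
static uint64_t resp_iova(uint64_t wire_va, int zero_based, uint64_t mw_addr)
{
	return zero_based ? wire_va + mw_addr : wire_va;
}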